Disaster Recovery in Cloud Computing: Replication-Based Technologies

Replication-based technologies offer the promise of capturing a data set at a particular point in time with minimal overhead, both when
capturing the data and when restoring it later. There are four main methods of interest in today's storage environments:
● Whole-file replication copies files in their entirety. This is normally done as part of a scheduled or batch process, because files that
are copied while their owning applications have them open may not be copied properly. The most prevalent use of this technology is for
login scripts or other files that don't change frequently.
● Application replication copies a specific application's data. The general usefulness of this method varies dramatically based on the
feature set of the application, the demands of the application, and the way in which replication is implemented. This model is almost
exclusively implemented for database-type applications.
● Hardware replication copies data from one logical volume to another; the copying is typically done by the storage unit
controller. Normally, replication occurs when data is written to the original volume: the controller writes the same data to the original
volume and to the replication target at the same time. This replication is usually synchronous, meaning that the I/O operation isn't
considered complete until the data has been written to all destination volumes. Hardware replication is most often performed between
storage devices attached to a single storage controller, making it poorly suited to replicating data over long distances. Most hardware
replication is built on SAN-type storage or proprietary NAS filers.
● Software replication integrates with the Windows® operating system to copy data by capturing file changes as they pass
to the file system. The copied changes are queued and sent to a second server while the original file operation is processed normally
without impact to application performance. Protected volumes may be on the same server, on separate servers on a LAN, connected via a
storage-area network (SAN), or across a wide-area network (WAN). As long as the network infrastructure being used can accommodate the
rate of data change, there is no restriction on the distance between source and target. The result is cost-effective data protection.
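
The difference between the synchronous behavior described for hardware replication and the queued behavior described for software replication can be sketched in a few lines. This is only an illustrative in-memory model, not any vendor's implementation: the class names, the dictionary "volumes", and the drain() method are all assumptions made for the example.

```python
from collections import deque


class SyncReplicatedVolume:
    """Synchronous model: a write completes only after every copy is updated."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, block, data):
        self.primary[block] = data
        for replica in self.replicas:
            replica[block] = data  # the caller waits for each destination
        return True  # acknowledged only after all copies hold the data


class AsyncReplicatedVolume:
    """Queued model: changes are captured, then sent to the target later."""

    def __init__(self):
        self.primary = {}
        self.target = {}
        self.pending = deque()  # captured changes not yet sent

    def write(self, block, data):
        self.primary[block] = data
        self.pending.append((block, data))
        return True  # acknowledged before the target has the data

    def drain(self, max_changes=None):
        """Send queued changes to the target in order; return how many were sent."""
        sent = 0
        while self.pending and (max_changes is None or sent < max_changes):
            block, data = self.pending.popleft()
            self.target[block] = data
            sent += 1
        return sent
```

In this model, whatever is still sitting in pending has not reached the target yet. If the link cannot keep up with the rate of change, that queue keeps growing, which is why the network has to accommodate the rate of data change.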

To best understand how to protect data, it's important to consider what the data is being protected from. Evaluating the usefulness of
replication for particular conditions requires us to examine four separate scenarios in which replication might lead to better business
continuity:
● Loss of a single resource - In this scenario, a single important resource fails or is interrupted. For example, losing the web server that
end-users use for product ordering would cripple any agency that depends on electronic procurement. Likewise, many agencies would be
seriously affected by the loss of one of their primary e-mail servers. For these cases, some agencies will investigate fault-tolerant
architectures, but many don't invest in fault-tolerance technology for file and print servers, even though the failure of a single file server
may simultaneously prevent several departments' employees from accessing their data. Planning for this case usually revolves around
providing improved availability and failover for the production resources.
● Loss of an entire facility - In this scenario, an entire facility and all of its resources are unavailable. This can happen as the result of
natural disasters, extended power outages, failure of the facility's environmental conditioning systems, persistent loss of
communications, or terrorist acts. For many agencies, the normal response to the loss of a facility is to initiate a disaster recovery plan and
resume operations at another physical site.
● Loss of user data files - This unfortunately common scenario involves the accidental or intentional loss of important data files. The most
common mitigation is to restore the lost data from a backup, but this normally means going back to the most recent recovery point
(as set by the recovery point objective, or RPO), often losing any data changed since then.
● Planned outages for maintenance or migration - The goal of planned maintenance or migrations is usually to restore or repair
service in a way that's transparent to the end users.
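
The data loss mentioned under "Loss of user data files" above can be made concrete with simple date arithmetic: whatever changed between the last usable copy and the moment of loss is gone after a restore. The sketch below is illustrative only. The function name and the example timestamps are hypothetical, and it assumes that a usable copy exists as of the recovery point.

```python
from datetime import datetime, timedelta


def data_loss_window(last_recovery_point, failure_time):
    """Return the span of recent changes a restore would discard."""
    if failure_time < last_recovery_point:
        raise ValueError("failure_time precedes last_recovery_point")
    return failure_time - last_recovery_point


# Hypothetical case: nightly backup taken at 01:00, file lost at 15:30.
nightly = data_loss_window(
    datetime(2010, 7, 21, 1, 0), datetime(2010, 7, 21, 15, 30)
)

# Hypothetical case: a copy captured 5 seconds before the loss.
recent = data_loss_window(
    datetime(2010, 7, 21, 15, 29, 55), datetime(2010, 7, 21, 15, 30)
)

print(nightly)  # 14:30:00
print(recent)   # 0:00:05
```

The shorter the interval between recovery points, the less work has to be redone after a restore.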

Wednesday, July 21, 2010
