Contributed by Andrew Whitehead

Alternate Site Disaster Recovery Techniques

In considering alternate site disaster recovery, the two main issues are reconfiguring or rebuilding infrastructure and moving data between the primary site and the alternate site.

Continuous Availability Disaster Recovery

Continuous availability means real-time data replication. It is arguably not a disaster recovery technique at all, because no recovery is actually performed: the data is never lost. This approach relies on continuous data transfer, which can be achieved in several ways.

One way is to use a transaction router, usually a software-based solution that simultaneously routes transactions to both the primary and alternate locations. The transaction routing server itself must be secure, and multiple servers are often used. While this approach offers a very high degree of disaster recovery, it is not appropriate in all cases, is proprietary in nature, and does not work well on a large scale.
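The routing idea can be illustrated with a minimal sketch. The `Site` and `TransactionRouter` classes below are hypothetical names invented for this example, not part of any real product, and the sites are written to one after the other rather than truly simultaneously. The point is only that every transaction is sent to both locations, so the alternate site still holds the data if the primary is lost.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for one site's transaction store."""
    name: str
    available: bool = True
    log: list = field(default_factory=list)

    def apply(self, txn: str) -> bool:
        # Return True only if this site accepted the transaction.
        if not self.available:
            return False
        self.log.append(txn)
        return True


class TransactionRouter:
    """Sends every transaction to both the primary and alternate sites."""

    def __init__(self, primary: Site, alternate: Site):
        self.sites = [primary, alternate]

    def route(self, txn: str) -> dict:
        # Record which sites acknowledged, so the caller can detect divergence.
        return {site.name: site.apply(txn) for site in self.sites}


primary = Site("primary")
alternate = Site("alternate")
router = TransactionRouter(primary, alternate)

router.route("debit account 1")
primary.available = False  # simulate loss of the primary site
acks = router.route("credit account 2")
print(acks)
print(alternate.log)
```

In this sketch the alternate site's log contains both transactions even though the primary went down before the second one, which is the property the transaction router is meant to provide. A real router would also have to decide what to do when only one site acknowledges.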

A storage controller offers disaster recovery by replicating data on a volume-by-volume basis, moving it from a storage array on the primary site to an identical array on the alternate site. Since this is simply data storage, it is suitable for large-scale operations and is independent of applications and operating systems. It is a very popular disaster recovery technique, especially when used in conjunction with hot servers.
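As a rough illustration of volume-by-volume replication, the sketch below models each array as a dictionary of volumes, with each volume a list of opaque blocks. The function names and the choice to copy only blocks that differ are assumptions made for this example; real storage controllers do this in firmware and their methods vary. Because blocks are treated as plain bytes, nothing in the code depends on the application or operating system that wrote them.

```python
def replicate_volume(source: list, target: list) -> int:
    """Make target match source block by block; return the number of blocks copied."""
    copied = 0
    for index, block in enumerate(source):
        if index >= len(target):
            target.append(block)      # target volume is shorter: extend it
            copied += 1
        elif target[index] != block:
            target[index] = block     # block differs: overwrite it
            copied += 1
    del target[len(source):]          # drop any blocks the source no longer has
    return copied


def replicate_array(primary: dict, alternate: dict) -> dict:
    """Replicate every volume on the primary array to the alternate array."""
    return {
        name: replicate_volume(blocks, alternate.setdefault(name, []))
        for name, blocks in primary.items()
    }


primary_array = {
    "vol0": [b"boot", b"os"],
    "vol1": [b"db-page-1", b"db-page-2"],
}
alternate_array = {
    "vol0": [b"boot", b"os"],
    "vol1": [b"db-page-1", b"stale"],
}

copied = replicate_array(primary_array, alternate_array)
print(copied)
```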

Hot Server Disaster Recovery

The hot standby server technique focuses on being able to recover and reconfigure servers at a remote site as quickly as possible if any server on the primary site fails. This is called a 'failover' and is usually done automatically. It must be supported by a good data replication technique, so that on restart the necessary data is present. This can be achieved by remote tape vaulting, where duplicates of the tapes in the primary location's tape library are housed in an automated tape library at a remote site. This has the advantage that no time is lost in locating or transporting tapes.
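Automatic failover is often driven by some form of health check. The sketch below is one possible arrangement, not a description of any particular product: the `FailoverMonitor` class, the heartbeat input, and the threshold of three consecutive misses are all assumptions chosen for illustration. Requiring several misses in a row avoids failing over on a single dropped heartbeat.

```python
class FailoverMonitor:
    """Switches to the standby after a run of missed primary heartbeats."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold  # consecutive misses needed to fail over
        self.misses = 0
        self.active = "primary"

    def record_heartbeat(self, primary_ok: bool) -> str:
        if self.active == "standby":
            # Already failed over; failing back is left as a manual step here.
            return self.active
        if primary_ok:
            self.misses = 0
        else:
            self.misses += 1
            if self.misses >= self.threshold:
                self.active = "standby"
        return self.active


monitor = FailoverMonitor(threshold=3)
heartbeats = [True, False, False, True, False, False, False, True]
history = [monitor.record_heartbeat(ok) for ok in heartbeats]
print(history)
```

With this input the monitor stays on the primary through the first two misses, because a good heartbeat resets the count, and fails over only after the later run of three.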

Warm Server Disaster Recovery

In this system, the servers at the alternate location have operating systems and applications loaded and running, with established network connections to the production network, ready to take over from a failed server at the primary site. These servers can be used for non-critical tasks such as backups, application development, or quality assurance, and require only minor reconfiguration to switch to production support.

Cold Server Disaster Recovery

Cold server recovery is the most basic method, and involves staff, tapes, etc. moving to a recovery facility to begin the rebuilding process. This is slow, labor-intensive and often unsuccessful. First the servers have to be restored from tape, and once that is done the data also has to be restored from tape, which takes a tremendous amount of time.

Because recovery is already slow, cold server recovery can be paired with off-site tape warehousing, in which duplicate backup tapes are physically moved to a remote warehouse. Should the tapes be needed, they are taken out of storage and transported to the recovery site. This is obviously a slow process, but because it can be done at the same time as the staff are moving and the servers are being restarted, the delay is acceptable.

