We all have a few digital assets that we would hate to lose. For me, it is all the pictures and videos of my kids that I have taken over the years. I take a few more pictures every day, so I need to be continuously protected. If I were a corporation, and this were my critical data, a foolproof Disaster Recovery (DR) policy would be necessary. A DR plan protects my data and business if disaster strikes – whether from natural or man-made causes. For my pictures, this disaster could take the form of a virus, a faulty hard drive, or my basement flooding.
In any Disaster Recovery and Business Continuity plan, it is important to take several factors into consideration.
- Data – How much can you afford to lose? One day, one hour, or none at all? DR professionals call this your RPO, or recovery point objective. If you back up your data nightly, and your computer crashes right before the nightly backup, you lose everything you have done that day and need to restore your data from the previous night's backup.
- Time to Recover – Even if you have all your data and don’t lose a single transaction (or picture, in my case), you still may not have access to the applications, network, or servers to operate. How many minutes, hours, or days can you afford to be without your computer systems and critical applications? This is your RTO, or recovery time objective. RTO may be different depending on the application. For an e-commerce application, the RTO will likely be very short – you will lose revenue and customers if your site remains down.
- People and Space – Where will your people work? What if certain key employees are personally affected by the disaster? In my case, if my basement were to flood, could I move a new computer into the kitchen? Not if my wife had anything to do with it.
- Cost – Instantaneous DR with zero RTO and zero RPO may be unachievable, and as you get closer to this ideal, the costs rise exponentially.
- Testing – If you put a plan in place, how difficult is it to test? Will your plan work as architected? As I type this, I realize that I have never actually tested whether I could recover all my pictures and videos. Testing is often postponed or ignored.
For my personal computer and pictures, I have an RPO of 24 hours: in the worst case, I could lose up to 24 hours of data, and I can live with that. My RTO is pretty liberal. If my home computer goes up in smoke, having a new one up and running within a week or two would be fine (as long as I still have my pictures).
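To make the RPO idea concrete, here is a minimal sketch in Python. The function names, dates, and times are hypothetical and chosen only for illustration; the point is that the worst-case loss equals the gap between backups, so a backup schedule meets an RPO only if that gap is no longer than the RPO.

```python
from datetime import datetime, timedelta


def data_at_risk(last_backup: datetime, failure: datetime) -> timedelta:
    """Everything done since the last good backup is lost in a failure."""
    return failure - last_backup


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, the failure hits just before the next backup runs,
    so the full interval between backups is at risk."""
    return backup_interval <= rpo


# Hypothetical example: nightly backup at 2:00 AM, crash at 1:55 AM the next night.
last_backup = datetime(2024, 1, 1, 2, 0)
failure = datetime(2024, 1, 2, 1, 55)
lost = data_at_risk(last_backup, failure)

nightly = timedelta(hours=24)
ok_for_my_pictures = meets_rpo(nightly, rpo=timedelta(hours=24))
ok_for_one_hour_rpo = meets_rpo(nightly, rpo=timedelta(hours=1))
```

With a nightly backup, almost a full day of work is at risk in this example. That fits my 24-hour RPO, but it would not fit an application that can tolerate only one hour of lost data.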
For my personal pictures and videos, I used to have to back up everything on DVDs and then make sure these were stored in a safe location. However, my system was never very secure. If my house had burned down, all my backup DVDs would have been destroyed as well. Corporations have traditionally had similar concerns. A typical DR solution might have been something similar to the following:
- Redundant hardware in a data center (at double the expense) made certain that a single failure did not wipe out systems and data.
- Nightly backups and storing tapes at an off-site location protected data.
- DR contracts were signed with providers of hot or cold sites to bring critical systems back up after a disaster. SunGard and IBM were leaders in providing such offerings to enterprise accounts.
Although this strategy has been effective, it comes with numerous challenges. The recovery time can be days, testing is time-consuming and expensive, and tapes can be unreliable, lost, or corrupted. Providers may have issues running your applications on their hardware – assuming that hardware is even available in a disaster.
Over the past few years, the Cloud has totally changed the DR landscape. Individual consumers immediately saw the value of getting rid of backup processes that they probably never did anyway. Today, all my pictures and videos are stored in the Cloud, with two different providers! Consumers have many cost-effective and reliable solutions for backing up and recovering home PC applications and data.
Enterprises, of course, have been slower to move. However, continued price drops and innovation have made Cloud technologies nearly impossible to ignore. With Amazon Web Services, the value proposition for Disaster Recovery is especially compelling:
- Data is stored in three geographically dispersed data centers, allowing for 99.999999999 percent data durability. To put this in perspective, if I stored my 10,000 pictures in AWS, I could expect to lose, on average, 1 of those 10,000 pictures every 10 million years.
- Your server images can also be protected in multiple data centers. Server instances can be set up using what AWS calls the “Pilot Light” approach. Just as with the pilot light in your water heater, you pay for only a very small percentage of the total capacity until you need it. In a disaster, you can “fire up” all your servers and be up and running in minutes, a fraction of the time it would take in a traditional recovery center.
- If you have applications that need to be backed up instantaneously, an active-active architecture can be set up in AWS. If disaster strikes, systems can immediately fail over to systems in the Cloud. You can also decide on an RTO application by application. Recovery in minutes using Pilot Light may be fine for most of your applications, whereas others can be set up as active-active, allowing for nearly instantaneous failover.
- Finally, there is no contract! Get your system architected correctly, then pay as you go, only paying for full capacity if you have a disaster. Interestingly, companies have failed over to AWS and then never rebuilt their on-premises infrastructure. They found the reliability, flexibility, and cost of AWS to be superior to their previous in-house infrastructure. It took a disaster to figure this out.
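The durability figure in the list above can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch, not an official AWS calculation. It assumes the 99.999999999 percent figure is a per-object, per-year durability, and the variable names are my own.

```python
from fractions import Fraction

# Assumption: the quoted durability applies to each object over one year.
annual_durability = Fraction("0.99999999999")      # eleven nines
annual_loss_probability = 1 - annual_durability    # chance one object is lost in a year

pictures = 10_000

# Expected number of pictures lost per year across the whole collection.
expected_losses_per_year = pictures * annual_loss_probability

# Average number of years before one picture is expected to be lost.
years_per_expected_loss = 1 / expected_losses_per_year

print(f"Expected losses per year: {float(expected_losses_per_year):.0e}")
print(f"Years per expected loss:  {int(years_per_expected_loss):,}")
```

Exact fractions are used instead of floating point so that subtracting a number this close to 1 does not introduce rounding noise.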
In summary, the Cloud is now an incredible tool for storing critical data and recovering critical systems if a disaster strikes. Enterprises of all sizes should start looking at how they can reduce the cost and complexity of their current DR plans, and how they can use the Cloud as a more affordable and reliable alternative. Be prepared – and good luck in your move to the Cloud.