Big Data Disaster Recovery Preparation Tips

Preparing to recover Big Data workloads after an unexpected disaster requires more than just having data backups on hand. This article explains how to build an effective Big Data disaster recovery strategy.

Disaster recovery is the process of restoring normal operations after an unexpected event destroys part or all of your IT infrastructure.

Every organization should have a disaster recovery plan in place. The need is even greater for companies that rely heavily on data to drive their business and must restore data-based operations quickly after a disaster.

5 Must-Haves for Big Data Disaster Recovery

An effective Big Data disaster recovery plan includes the following five elements.

1. Off-Site Data Backups

Backing up your data to a remote location is the most obvious disaster recovery preparation step. Off-site backups ensure that data will remain unharmed in the event that a physical disaster, such as a fire or a major storm, destroys your production infrastructure.

The most important thing to keep in mind about off-site data backups is that they are not enough on their own to ensure reliable disaster recovery. See also: Data Backup vs. Disaster Recovery: Yes, There’s a Big Difference

2. On-Site Backups

Off-site data backups are the most dependable way to keep data available, because they survive a disaster that destroys your primary site.

In some cases, however, it may also make sense to keep on-site backups. The advantage of on-site data backups is that data can often be restored more quickly from on-site servers than from remote sites, provided, of course, that some of your on-site infrastructure survives the disaster.
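
To make the speed difference concrete, here is a rough back-of-the-envelope sketch in Python. The data volume and throughput figures are hypothetical assumptions chosen for illustration, not measurements, and real restores are also limited by disk speed, protocol overhead, and verification time.

```python
def restore_hours(data_tb: float, throughput_gbps: float) -> float:
    """Estimate hours needed to transfer data_tb (decimal terabytes)
    over a link sustaining throughput_gbps (gigabits per second)."""
    if throughput_gbps <= 0:
        raise ValueError("throughput_gbps must be positive")
    bits = data_tb * 1e12 * 8                  # terabytes -> bits
    seconds = bits / (throughput_gbps * 1e9)   # bits / (bits per second)
    return seconds / 3600


# Hypothetical figures: 50 TB restored over a 10 Gbit/s local network
# versus a 1 Gbit/s link to a remote site.
on_site = restore_hours(data_tb=50, throughput_gbps=10)
off_site = restore_hours(data_tb=50, throughput_gbps=1)
print(f"on-site: {on_site:.1f} h, off-site: {off_site:.1f} h")
# prints "on-site: 11.1 h, off-site: 111.1 h"
```

Under these assumed numbers, the remote restore takes about ten times longer, which is why a surviving on-site copy can shorten recovery considerably.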

3. Big Data Recovery Playbooks

When you’re dealing with an unexpected infrastructure failure, you need a plan in place for guiding all of your actions as you restore data. The last thing you should be doing is figuring things out as you go, or guessing what your next step should be.

This is why developing “playbooks” is so important. A playbook is a set of steps that you write out ahead of time – that is, before a disaster occurs – and follow when recovering from a disaster.

Your playbooks should be written to be somewhat adaptable, of course, because it’s impossible to predict every challenge you’ll face during disaster recovery. But having playbooks in place will do much to lay the groundwork for quick and efficient disaster recovery.
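
A playbook does not have to be a static document. As one possible illustration, the minimal Python sketch below models a playbook as an ordered list of steps that a team can track during recovery. The class names, fields, and example steps are all hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    description: str
    owner: str          # person or team responsible (hypothetical field)
    done: bool = False


@dataclass
class Playbook:
    name: str
    steps: List[Step] = field(default_factory=list)

    def next_step(self) -> Optional[Step]:
        """Return the first step not yet completed, or None if all are done."""
        for step in self.steps:
            if not step.done:
                return step
        return None

    def complete_next(self) -> Optional[Step]:
        """Mark the next pending step as done and return it."""
        step = self.next_step()
        if step is not None:
            step.done = True
        return step


# Example steps are illustrative assumptions only.
playbook = Playbook(
    name="Restore analytics cluster",
    steps=[
        Step("Confirm which storage nodes survived", owner="ops"),
        Step("Restore latest backup to production", owner="data-eng"),
        Step("Validate restored data and resume jobs", owner="analytics"),
    ],
)
playbook.complete_next()
print(playbook.next_step().description)
# prints "Restore latest backup to production"
```

Keeping the steps in order, with a named owner for each, means nobody has to guess what comes next, while the list can still be edited if the situation calls for it.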

4. Data Transformation Tools

Moving data from backup locations to production servers can be time-consuming when there is a lot of data involved. It is even more difficult if the data needs to undergo transformations – which is likely the case if, for example, your backup data is stored in one format but needs to be converted to a different format in production.

For this reason, it’s important to ensure that you’ll have good data transformation tools at your disposal during disaster recovery. This may require having backup instances of the tools available in case your production environments are destroyed.
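
As a small illustration of a format conversion during restore, the sketch below uses only the Python standard library. It assumes, purely for the sake of example, that backups are stored as gzip-compressed CSV and that production expects JSON Lines; your formats and tooling will likely differ. The function name and paths are hypothetical.

```python
import csv
import gzip
import json


def csv_gz_to_jsonl(src_path: str, dst_path: str) -> int:
    """Convert a gzip-compressed CSV backup into a JSON Lines file.

    Rows are streamed one at a time, so memory use stays flat even
    for large backups. Returns the number of rows written.
    """
    count = 0
    with gzip.open(src_path, "rt", newline="", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):      # first line is the header
            dst.write(json.dumps(row) + "\n")
            count += 1
    return count
```

A call such as `csv_gz_to_jsonl("backup/events.csv.gz", "restore/events.jsonl")` (paths are placeholders) would write one JSON object per CSV row. Keeping a script like this alongside the backups themselves means the conversion step does not depend on the production environment surviving.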

5. Data Capture Continuity

A disaster may destroy your ability to continue capturing data, but it doesn’t stop the data from flowing. During disaster recovery, it’s important to ensure that you maintain continuous capture of data to the extent possible, even if your analytics operations are interrupted.

If possible, have backup storage locations that will become operable in the event that your main storage servers are disrupted. Ensure that your backup locations have enough capacity to handle the amount of new data that will be generated during the time it takes to restore operations – which could be hours, days or longer, depending on the type of disruption you suffer and the extent of your infrastructure.
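
The capacity question above comes down to simple arithmetic: ingest rate multiplied by the expected outage duration, plus some headroom. The Python sketch below shows the calculation. The ingest rate, outage length, and safety factor are hypothetical assumptions, not recommendations.

```python
def required_capacity_gb(ingest_gb_per_hour: float,
                         outage_hours: float,
                         safety_factor: float = 1.5) -> float:
    """Estimate backup storage (in GB) needed to keep capturing data
    for the duration of an outage, with headroom for bursts."""
    if ingest_gb_per_hour < 0 or outage_hours < 0:
        raise ValueError("ingest rate and outage duration must be non-negative")
    if safety_factor < 1:
        raise ValueError("safety_factor must be at least 1")
    return ingest_gb_per_hour * outage_hours * safety_factor


# Hypothetical scenario: 200 GB/hour of new data, a three-day outage,
# and 50% headroom.
print(required_capacity_gb(ingest_gb_per_hour=200, outage_hours=72))
# prints 21600.0
```

Running the numbers for a few outage lengths (hours, days, a week) shows quickly whether your standby storage is sized for the disruptions you are planning for.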


Heeding the considerations discussed above helps ensure not just that you have backup data available in the event that disaster strikes, but also that you can restore production data operations as quickly as possible. Backup data on its own is of little value if you can’t put it to use quickly as part of a broader Big Data disaster recovery strategy.


