Disaster Recovery Strategies For Big Data

As big data becomes mainstream, organizations are exploring it to transform their business and give their clients insightful business intelligence. They have learned that big data is not a single technology, technique, or initiative. Instead, it is a trend that cuts across many areas of business and technology. With it, they have started defining new initiatives and are reevaluating existing strategies.


The sudden growth in data has changed the way companies access mission-critical information, deploy applications, and approach data protection in general. The concern now is whether these companies have a disaster recovery plan in place to recover from a catastrophic outage. A disaster recovery plan (DRP) documents the procedures required to recover critical data and IT infrastructure after an outage, and it can help get systems back online quickly and efficiently. Here are some key points to consider when deciding on a disaster recovery strategy:

  • Set up and finalize the RPO (Recovery Point Objective)

The recovery point objective (RPO) is the maximum acceptable amount of data loss after an unplanned outage, expressed as an interval of time. The RPO is crucial because it determines how often you need to back up your data. Ask yourself how many hours of data you can afford to lose when systems go down. If your RPO is two hours, then you need to perform backups at least every two hours. Management executives and the entire team need to agree on the RPO.
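
The relationship between backup frequency and the RPO can be sketched in a few lines of Python. This is a minimal illustration, not a monitoring tool: the function names and the backup schedules are hypothetical, and it assumes you can obtain the completion time of each backup from your own backup system.

```python
from datetime import datetime, timedelta

def max_backup_gap(backup_times):
    """Return the longest interval between consecutive backups."""
    ordered = sorted(backup_times)
    gaps = (later - earlier for earlier, later in zip(ordered, ordered[1:]))
    return max(gaps, default=timedelta(0))

def meets_rpo(backup_times, rpo):
    """True if no gap between consecutive backups exceeds the RPO."""
    return max_backup_gap(backup_times) <= rpo

# Hypothetical schedules, for illustration only.
start = datetime(2024, 1, 1, 0, 0)
every_two_hours = [start + timedelta(hours=2 * i) for i in range(5)]
every_three_hours = [start + timedelta(hours=3 * i) for i in range(5)]
rpo = timedelta(hours=2)

print(meets_rpo(every_two_hours, rpo))    # True
print(meets_rpo(every_three_hours, rpo))  # False
```

A schedule that backs up every three hours fails a two-hour RPO because an outage just before the next backup would lose up to three hours of data.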

  • Offsite backups

The most obvious disaster recovery step is to keep a copy of the data in a remote location. Off-site backups ensure that the data remains unharmed during unforeseen events such as a natural calamity or a fire. For big data, cloud-based backup is probably the best option because it is cheap and easy to back up your data to the cloud, particularly batch data, which is large and static.

  • Conduct recovery tests regularly

To be confident in a disaster recovery plan, you have to test it. The safest approach is to test a newly implemented plan at least semi-annually, and then yearly after that. The tests you conduct should verify that your disaster recovery procedures can restore big data workloads within the predefined RPO.
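
One way to score a recovery drill against the RPO is to measure how much data the simulated outage would have cost. The sketch below is only an assumption about how such a check might look: the function names, the backup times, and the outage times are all hypothetical, and a real drill would also restore the data and validate it.

```python
from datetime import datetime, timedelta

def data_loss_window(backup_times, outage_time):
    """Time between the last backup taken before the outage and the outage."""
    usable = [t for t in backup_times if t <= outage_time]
    if not usable:
        raise ValueError("no backup precedes the outage")
    return outage_time - max(usable)

def drill_passes(backup_times, outage_time, rpo):
    """True if the data lost in the simulated outage stays within the RPO."""
    return data_loss_window(backup_times, outage_time) <= rpo

# Hypothetical drill data, for illustration only.
backups = [datetime(2024, 1, 1, hour) for hour in (0, 2, 4)]
rpo = timedelta(hours=2)

print(drill_passes(backups, datetime(2024, 1, 1, 5, 30), rpo))  # True
print(drill_passes(backups, datetime(2024, 1, 1, 7, 0), rpo))   # False
```

In the second case the last usable backup is three hours old at the time of the outage, so the drill reports that the RPO was missed.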

  • Use data recovery tools

Make sure that you have reliable data recovery tools at your disposal during disaster recovery. This may require keeping backup instances of those tools available in case your production environment is destroyed.

  • Have continuity in data gathering

A disaster does not stop the flow of data. During disaster recovery, you need to make sure that data continues to be captured even though your operations are down. Make sure that your backup storage locations have enough spare capacity to hold the new data that is generated while restore operations are running.
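
The spare capacity needed during a restore can be estimated from the ingest rate and the expected restore time. The following is a rough sketch under stated assumptions: the function names, the 20% safety margin, and the example figures are hypothetical, and it assumes a roughly constant ingest rate.

```python
def required_spare_gb(ingest_gb_per_hour, restore_hours, safety_margin=0.2):
    """Estimate the storage needed for data arriving during a restore.

    safety_margin is extra headroom as a fraction (0.2 adds 20%).
    """
    return ingest_gb_per_hour * restore_hours * (1 + safety_margin)

def has_enough_capacity(free_gb, ingest_gb_per_hour, restore_hours,
                        safety_margin=0.2):
    """True if the free space covers the data expected during the restore."""
    needed = required_spare_gb(ingest_gb_per_hour, restore_hours, safety_margin)
    return free_gb >= needed

# Hypothetical figures: 50 GB/hour of new data and an 8-hour restore,
# which needs roughly 480 GB including the margin.
print(has_enough_capacity(600, 50, 8))  # True
print(has_enough_capacity(400, 50, 8))  # False
```

If the ingest rate is bursty, sizing from the peak rate rather than the average gives a safer estimate.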

In a rapidly changing environment where organizations are exposed to threats, it is vital to have a robust and well-tested disaster recovery plan in place. As the saying goes, “prevention is better than cure,” so securing and backing up data using new technologies should be a top priority for organizations.

