Over the years, we’ve all become used to seeing disasters, whether natural or man-made, represented in terms of their cost, writes Ian Wells. For instance, Hurricane Sandy’s cost has been estimated at $68 billion, while the World Bank put the cost of the 2011 Tohoku earthquake at $235 billion. Yet these numbers show neither the human cost nor the true effect of disaster scenarios on individual organisations.
‘Tens of billions of dollars’ is a statistic: the Sendai Airport flooding, the New York Stock Exchange being closed for two days or The Huffington Post being taken offline for a large part of the run-up to the US Presidential Election are all far more tangible examples of the potential damage such disasters can cause.
Regardless of the severity of a disaster, these events can provide valuable lessons for the modern business.
To begin with, downtime is no longer an option. The modern business is a 24/7 operation wherein services must be constantly available. At the same time, any disaster recovery process is only as strong as its components. As The Huffington Post showed by suffering multiple failures in its data protection infrastructure, even a supposedly resilient system can be overloaded in the wrong circumstances.
Reliance on Information Technology
The threat from disasters has increased as modern businesses become increasingly reliant on IT. In such cases, a disaster need not affect buildings and other infrastructure at all.
For instance, failures of Amazon’s AWS and EC2 cloud infrastructure have affected businesses in a wide variety of industries, cutting them off from their websites and other services for hours or even days at a time.
Indeed, a virus or simple human error could render IT services unusable and so cripple a business without leaving so much as a scratch in the real world.
This reliance on IT doesn’t change the fundamental aim of disaster recovery: to ensure that work can resume almost where it left off before the disaster scenario occurred.
For real-world disaster recovery locations, inspecting the site and ensuring that everything functions correctly can be a relatively rare, though time-consuming and costly, process. Since the factors involved will not change greatly over time, an organisation only needs to ensure it has separate locations that should be unaffected by a disaster and that will provide the services it requires in order to function.
The Elephant in the Room
This approach to disaster recovery testing simply will not work for modern IT services. Data and applications change rapidly, on a daily, hourly or even faster basis, so knowing only that data can be recovered as it stood a week or a month ago is unacceptable.
Traditionally, testing whether IT services can be recovered in the event of a disaster has been sporadic at best. Research conducted by Vanson Bourne in 2013 found that enterprises only test 7.4% of their data back-ups at a time, and only perform that testing every three months due to the time and effort required. This means that, regardless of a disaster’s effect on physical infrastructure, the business may still find itself bereft of critical services.
With the technology now available this really shouldn’t be the case. Indeed, IT should be making disaster recovery more successful across the board.
For instance, more and more IT functions are now automated and there’s no reason why testing should not be included here. Improvements in computing power and the rapidly shrinking costs of IT storage mean that testing can also be performed daily – or even more frequently – without impacting ongoing IT services.
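To illustrate the principle, a daily automated test might restore each back-up to scratch space and compare checksums against the source. The sketch below is a minimal, hypothetical example of that idea (the file names and the simple file-copy “restore” step are illustrative stand-ins, not any specific product’s method):

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256(path: Path) -> str:
    """Return the SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_backup(source: Path, backup: Path, restore_dir: Path) -> bool:
    """Restore the back-up into restore_dir and check it matches the source."""
    restore_dir.mkdir(parents=True, exist_ok=True)
    restored = restore_dir / source.name
    shutil.copy2(backup, restored)   # stand-in for a real restore operation
    return sha256(restored) == sha256(source)

# Demonstration with throwaway files (a scheduler would run this nightly)
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    source = base / "orders.db"              # hypothetical production file
    source.write_text("order data, version 1")
    backup = base / "orders.db.bak"
    shutil.copy2(source, backup)             # what a nightly back-up job produces
    ok = verify_backup(source, backup, base / "restore")
    print("back-up verified" if ok else "back-up FAILED verification")
```

Because a run like this touches only a scratch directory, it can be scheduled as often as storage and compute allow without disturbing live services, which is exactly the point being made above.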
This increased availability and flexibility of IT resources can also increase the resilience of data protection. For example, it means that setting up multiple back-ups of data is quite affordable.
Essentially, even if both an organisation’s primary site and its back-up data are destroyed, the business will still have secondary or even tertiary back-ups upon which to rely.
The growth of cloud computing also makes it easier to provide IT services to almost any location, meaning that as long as the IT services are safe the business can continue to function.
Using IT to support disaster recovery
An example of using IT to support disaster recovery can be seen at Catalent, the worldwide drug development, delivery and supply company. When one of its Japanese facilities was directly threatened by power outages in the wake of the Tohoku earthquake, Catalent was able to immediately transfer the virtual IT infrastructure at that site to a second, unaffected site in Japan, and so continue operating with no disruption to IT services.
Less than a fortnight later, Catalent’s site in Corby, Northamptonshire was destroyed by fire. This site housed vital servers running inventory, production and shipping applications. Since the company had regularly backed up these servers, and had tested those back-ups to confirm they could be restored, the organisation was able to quickly re-establish operations at a second site with no loss of data.
Whether man-made or acts of God, disasters are essentially random in nature. Sometimes, only sheer luck will separate a swift recovery from lengthy downtime.
However, the odds should be stacked as far in a given business’ favour as possible. Recognising that evolutions in IT have changed the way in which disaster recovery works – and complementing regular testing of physical disaster recovery procedures with methods to ensure that IT will always function – will help lessen the chance that a modern, ‘always-on business’ is crippled by disaster.
Ian Wells is Vice-President for North West Europe at Veeam