Planning the best means preparing for the worst

Fri, 1st Jul 2011

Making sure organisations have up-to-date and accurate information about the changing feast that is disaster recovery presents a challenge to resellers whose customers are literally betting their business on the advice they get.

Disaster recovery and business continuity planning (DRBCP) is not some arcane add-on to existing ICT systems but a series of technology and people processes which ensure you still have a business when the dust settles.

As well as having some form of uninterruptible power supply (UPS) to smooth out supply and allow an orderly shutdown or handover to a generator, databases, email, customer records and other essential data need to be backed up.

Traditionally data was stored on magnetic tape but that’s rapidly shifting to digital disk arrays in storage area networks (SANs) or network attached storage (NAS), sensibly replicated remotely in data centers. The use of cloud computing and virtualisation, replacing older servers with a few powerful multi-tasking machines, continues to change the game, adding complexity and uncertainty in an area where businesses expect the exact opposite. Disaster recovery is not a hugely competitive area, as maintaining on-line real-time facilities across multiple locations requires deep pockets, industrial level resilience and military level security.

Maintaining operations

Typically, the software, systems and support are offered by major players including IBM, Hewlett Packard and Gen-i, alongside systems integrators who have their own partnerships. Increasingly the top internet service providers have their own data centers for clients as well as delivering software as a service (SaaS) and server co-location.

IDC describes DRBCP as the ‘ability to have continuity and maintain operations and services in the face of a disruptive event’.

While IDC’s New Zealand senior services analyst Rasika Versleijen-Pradhan says most larger and mid-tier organisations have business continuity strategies, the events in Christchurch have heightened awareness of the importance of DRBCP. They have also renewed the focus on auditing and testing back-up and recovery systems to determine how long it might take to get businesses back on track post-disaster, and whether that time lag is acceptable.

“The opportunity is not so much to educate the market but to ensure the budget is set aside for this,” says Versleijen-Pradhan.

She advises resellers, vendors and consultants to take a whole-of-business approach and ensure they’re talking to top management, with buy-in from all departments at the highest possible level.

Clients need to be able to discuss with vendors and service providers the different levels of tests and layers of recovery, particularly around mission critical applications such as finance or product and services delivery.

“All lines of the business need to be aware of what is being proposed and how it will work, to avoid anything being taken for granted.”

Spreading the risk

Any outage could potentially impact the whole business and supply chain, so companies need to establish best practice to ensure minimal disruption, with service level agreements (SLAs) in place with vendors around those capabilities.

Versleijen-Pradhan says it’s critical for resellers not only to assess the DRBCP status of potential clients but also to talk about their own continuity systems and where the market is moving.

“The last thing you want is to tarnish your company’s reputation and that of your client by having downtime through some outage that you should have had under control.”

A recent IDC survey of 100 organisations in New Zealand revealed that disaster recovery was among the top priorities when improving or designing a data center.

One of the keys is spreading the risk across more than one data center or region to increase client confidence.

“The Government, for example, is adamant it wants to have local data centers spread across certain geographic areas in New Zealand with certain distances between them to make sure it can continue business as usual regardless of outages or disasters.”

Security still a concern

However, Graham Titterington, principal analyst with Ovum Australia, warns the switch over from the primary site and facilities to disaster recovery facilities can present an area of vulnerability. “It is typically a stressed time and a little chaotic, giving attackers an opportunity to penetrate the confusion – particularly if the disaster was deliberately triggered.”

Disaster recovery requires multiple copies of data to be created but Titterington says close attention to processes is critical to ensure vulnerabilities are not exploited. And while backup data is usually encrypted at the whole media level, the process of restoring the data after an incident can be defeated if you can’t locate the encryption key.

“It is vital to ensure the encryption key is available when needed, which usually means storing it on the same media as the back-up data, which reduces security to simply requiring a password to release the key.”

Although the move to cloud and virtualised systems offers efficiencies and cost savings, the implications for disaster recovery and business continuity are still being worked through.

In late 2010, Symantec warned that 47% of virtualised servers were not covered in existing data recovery plans and half the data on virtual systems was not regularly backed up.

Clouded by complexity

In fact Symantec suggests virtualisation and cloud technologies add complexity to disaster recovery initiatives, with a big gap between downtime expectations and reality.

Two-thirds of Australasian data center managers having problems protecting and managing mission critical applications and data in virtual environments said the use of multiple tools presented a major challenge.

While the majority of local businesses ran approximately 50% of their mission-critical applications in the cloud, 94% still had security concerns. The biggest challenge for disaster recovery was the ability to control failovers and make resources more highly available. Recovering from an outage typically took twice as long as expected. If the main data center was destroyed most expected to be up and running within two hours, half that of the 2009 survey. However in reality the average downtime, across an expected six annual incidents, was six hours. The major causes for downtime were system upgrades, power outages and failures, cyber attacks and natural disasters.

IDC’s Versleijen-Pradhan agrees cloud-based services for servers and storage along with virtualisation and SaaS require a whole different set of disciplines. “We’re moving into an always-on economy where DRBCP becomes all the more relevant. The whole movement toward the cloud is getting more dynamic and that means infrastructure has to be more flexible.” She says the ability to integrate across the whole area of convergence, including mobility, is as important to the virtual eco-system as managing and getting it back up and running quickly.

Eco-system recovery

Meanwhile data center providers are realising that it is critical to offer vendor independent services and that customers want a relationship that is as close as that with other partners in the supply chain. To that end customers are increasingly demanding more information about where their data is being housed and how their service provider approaches its own DRBCP. While that might be possible in the private cloud, Ovum’s Titterington says most public cloud providers cannot give any assurance about the geographical location of back up sites, even if their primary site is local. “This may be a security or compliance issue in itself. A recent survey by Trend Micro has shown that data breaches in the cloud are occurring at an unacceptable rate and that fears about cloud security are not as theoretical as we had assumed,” he says.

Ovum’s 2011 Trends to Watch report says the volume of data stored by organisations in the cloud continues to soar, in some cases by as much as 50% a year, with a growing demand for higher level access. In fact storage has become one of the fastest evolving areas of IT, with vendors moving into deduplication (which strips out double-ups) and other techniques to cut costs and disk consumption.

Further complicating matters is the use of server virtualisation, which demands more flexible and integrated storage media that can better handle the intense traffic loads. Potential solutions include high speed flash storage, expected to become ‘a pervasive new tier of storage’ when prices are forced down. And while the public cloud has not generally been suited to the enterprise, recent breakthroughs will help remedy this.

An emerging breed of service providers is attaching an enterprise friendly front-end to public cloud storage, creating better performing options for disaster recovery and the storage of non-performance sensitive data.

The challenge remains for resellers and vendors to ensure their clients are well informed and have budget set aside to move business continuity to the next level. In the interim, heightened awareness around disaster recovery presents an opportunity for auditing and testing existing capabilities.

Where possible the management of physical, virtual and cloud environments should be through an integrated suite of tools rather than multiple applications. As Symantec recommends, everything in the back-up and recovery process should be prioritised and automated and training undertaken to ensure people and processes align in an overall organisational plan.

Even when the right recovery tools are in place, if workers don’t know how to re-establish their daily routines from a remote site or home base, you simply move from one kind of disaster to another.