Disaster Recovery… Why don’t we ever trust it?

Over the last 10 years or so, the IT industry has become serious about pushing out better and more reliable applications. Two hallmarks of a reliable application are redundancy and disaster recovery (DR). Many companies have spent billions of dollars “carbon copying” their primary datacenters in pursuit of failover capabilities. But in that pursuit they have found many problems associated with running not just one but maybe three datacenters in this ever-expanding digital world. Some of these problems are:

  1. Outsourcing – most large companies will not outsource their DR site to a co-location company for fear that a competitor might gain access to their data/assets. So the cost of running a DR datacenter is nearly the same as the cost of running the primary site.
  2. Trusting the DR applications – As applications and operating systems are constantly updated in your Dev, QA, UAT, and Production systems, quite often these updates are either not promoted to the DR environment or are purposefully kept from it. This slowly causes the DR site to drift out of sync with Production.
  3. Lack of testing DR systems – I have seen many large companies and government agencies take downtime on an application or system rather than actually pull the trigger and bring the DR site up. They either don’t have faith in it or have not tested it extensively.
  4. Certification of systems/applications – LOL, need I say more? Can I get a second helping of certification sir?

Bottom line: DR centers are everywhere, and they are sitting idle and outdated. The real solution to an effective DR center is measuring those systems, at any given time, against what is expected to be available and running. DR and Production should ALWAYS be identical, at least at the application layer. Having a DR version of a trading application, for example, that is one revision off, and then failing over to it, could spell disaster (which is why no one is willing to pull the trigger on a failover). Centrally managed and deployed changes must be replicated and validated in both Production and DR so that you can trust that what is in Production is also in DR. No one wants to run an application that might be dirty. That means measuring those applications against a trusted source at every stage of the application and system lifecycle, from Development to DR and all stops in between.
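To make "measuring against a trusted source" a little more concrete, here is a minimal Python sketch. It assumes each environment can report a simple mapping of application name to version, and that a trusted baseline of the same shape exists. The names `Drift`, `find_drift`, and `safe_to_fail_over`, as well as the sample application names and versions, are hypothetical and only for illustration; a real shop would pull this data from its own deployment or configuration management tooling.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Drift:
    """One mismatch between the trusted baseline and a site (hypothetical type)."""
    app: str
    expected: Optional[str]  # version in the trusted baseline, None if not listed
    actual: Optional[str]    # version found at the site, None if not installed


def find_drift(trusted: Dict[str, str], site: Dict[str, str]) -> List[Drift]:
    """Return every application whose version at the site differs from the baseline."""
    drift = []
    # Look at the union of names so missing and unexpected apps are both reported.
    for app in sorted(trusted.keys() | site.keys()):
        expected = trusted.get(app)
        actual = site.get(app)
        if expected != actual:
            drift.append(Drift(app, expected, actual))
    return drift


def safe_to_fail_over(trusted: Dict[str, str],
                      production: Dict[str, str],
                      dr: Dict[str, str]) -> bool:
    """True only when both Production and DR match the trusted baseline exactly."""
    return not find_drift(trusted, production) and not find_drift(trusted, dr)


if __name__ == "__main__":
    # Illustrative data only: DR is one revision behind on the trading application.
    trusted = {"trading-app": "4.2.1", "risk-engine": "1.9.0"}
    production = {"trading-app": "4.2.1", "risk-engine": "1.9.0"}
    dr = {"trading-app": "4.2.0", "risk-engine": "1.9.0"}

    for item in find_drift(trusted, dr):
        print(f"DR drift: {item.app} expected {item.expected}, found {item.actual}")
    print("Safe to fail over:", safe_to_fail_over(trusted, production, dr))
```

Comparing both sites against one baseline, rather than against each other, is deliberate: if Production and DR are compared only to each other, they can match while both are wrong.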
