How will you measure & manage in a virtual world?

April 26, 2007

Our latest whitepaper examines the challenges facing virtualization and their solutions.  It covers how virtualization technologies are being used to increase the efficiency of the modern data center, some of the problems that have arisen with their use, and how those problems may be mitigated.


Is Software Measurable?

April 7, 2007

How do you measure anything?  Generally, metrology, the science of weights and measures, involves the analysis of a sample against a standard reference model.  For physical measurements, Standard Reference Materials (SRMs) serve as the calibration standard.  So then, how do you measure software, which is merely a collection of electronic binary digits (bits) representing programs, graphics, and documentation?

Every day, we download or install software onto our laptops, desktops, servers and PDAs with no understanding of what it is we’re actually getting. Mostly it’s a specific software vendor’s reputation or a promising new feature that drives us to blindly point, click and install, update or upgrade our software. It’s interesting to note that the methods that control the delivery and usage of that software are all covered by standards developed to span the seven layers of the OSI Model. But when it comes to measuring the software itself at layer seven, for quantitative, qualitative or authoritative results, the user is left with nothing but hope.  Hope that he hasn’t introduced malware or instability into his platform.

But there are technologies today that can be used to provide proven, effective and compact measurements for software.  For example, one-way cryptographic hashing functions, such as SHA-1, easily transform large bodies of bits into definitive short-hand signatures (also called fingerprints) that, for practical purposes, uniquely represent the original source data.  When the source data is widely distributed, anyone can generate the same cryptographic hash and, if the source hash is available, compare the two to verify the authenticity of the data.  A change to any single bit of information would result in a completely different fingerprint, enabling immediate detection of alterations. The main point here is that the sample data must be checked against a known reference in order to determine its validity.
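As a minimal sketch of the fingerprinting idea above, here is how it might look with Python's standard hashlib module. The function names (fingerprint, matches_reference) and the sample data are illustrative assumptions, not part of any particular product. SHA-1 is the default only because it is the function named above; another algorithm such as SHA-256 can be passed by name.

```python
import hashlib


def fingerprint(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hex digest (fingerprint) of a body of bits."""
    return hashlib.new(algorithm, data).hexdigest()


def matches_reference(data: bytes, reference_hash: str, algorithm: str = "sha1") -> bool:
    """Check a sample against a known reference fingerprint."""
    return fingerprint(data, algorithm) == reference_hash.strip().lower()


# Hypothetical sample data, for illustration only.
original = b"release build 1.0"
reference = fingerprint(original)   # the hash the source would publish
tampered = b"release build 1.1"     # differs by a single character

print(matches_reference(original, reference))   # True
print(matches_reference(tampered, reference))   # False
```

The check is only as good as the reference: if the published hash cannot be trusted, a matching fingerprint proves nothing.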

Anti-virus vendors use similar techniques to generate black-lists: lists of fingerprints for known bad software.  An AV agent collects sample measurements from the software on the target platform and compares them against a reference of undesirable elements. Based on this comparison, certain policies can be triggered, such as isolating and/or inoculating the unwanted or malicious code.

But wouldn’t it be more effective if this method were inverted?  That is, use proactive measurement and validation techniques (also called attestation) to sample software and then compare it against a known and trusted reference. With this “white-list” method, policy enforcement is grounded in, and extended up from, a source of known-good values, and only positively identified samples are allowed to run in the environment.
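The difference between the two policies can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's implementation: the function names, the reference sets and the sample byte strings are all hypothetical, and SHA-1 stands in for whatever measurement function is actually used.

```python
import hashlib


def measure(sample: bytes) -> str:
    """Take a compact measurement (fingerprint) of a software sample."""
    return hashlib.sha1(sample).hexdigest()


def blacklist_allows(sample: bytes, known_bad: set[str]) -> bool:
    """Black-list policy: run anything not positively identified as bad."""
    return measure(sample) not in known_bad


def whitelist_allows(sample: bytes, known_good: set[str]) -> bool:
    """White-list policy: run only what is positively identified as good."""
    return measure(sample) in known_good


# Hypothetical reference lists, for illustration only.
known_good = {measure(b"trusted editor v2")}
known_bad = {measure(b"old known virus")}

unknown = b"brand-new malware variant"
print(blacklist_allows(unknown, known_bad))    # True: never seen, so it runs
print(whitelist_allows(unknown, known_good))   # False: not known-good, so it is refused
```

A sample that has never been seen before slips past the black list but is stopped by the white list, which is the inversion described above.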

But here are the tricks:   

How is the white list (the standard or trusted reference) derived, maintained, managed and effectively deployed in the IT enterprise?   

How does one minimize or avoid false positives (software that appears dangerous but is really benign) or false negatives (software that appears benign but is really dangerous)?

How is trust “grounded” and normalized, and can we provide the desired zero-knowledge-proof?   

And is the white-list method mutually exclusive with the black-list method described above?

More on these issues in follow-on posts.  Your thoughts are welcome.

The evolution of security and systems management methods

April 4, 2007

As stated in a prior post, the goals of most IT departments are simple: Deploy and manage an agile, secure, reliable and stable global information technology (IT) infrastructure – and manage it with increasing efficiency.

Accomplishing these goals can be a challenging endeavor.  Industry and Government are increasingly dependent on IT systems.  And yet, is IT (on both the vendor and customer sides) really up to the task?  There is much evidence to suggest that it is not.

Blacklisting, along with other forms of signature-based filtering and anomaly detection, has traditionally been the de facto standard method for IT device security.  There is now sufficient evidence that these methods have reached the point of maximum, and even diminishing, returns for many, if not most, IT users.

Current generation “white listing” methods (such as Tripwire) are effective to an extent, but these relative integrity methods leave certain measurement gaps as well.  For example, how do I know that the code on the machine that purports to be authentic release code from vendor XYZ is really their code?  And relative integrity validation can still lead to integrity drift between like systems within the same enterprise.

So, as often happens, the answer to current and future needs can be gleaned from the past.  As the 19th-century scientist Lord Kelvin said: “To measure is to know” and “If you can not measure it, you can not improve it.”

IT systems management must continue its transformation from art to science.  Software measurement is the key – it can, and must, be done to close the gaps that we all struggle with around IT security, compliance, scaling, and stability.

Disaster Recovery… Why don’t we ever trust it?

April 4, 2007

Over the last 10 years or so, the IT industry has become serious about pushing out better and more reliable applications.  Part of what makes an application reliable is redundancy and disaster recovery.  Many companies have spent billions of dollars “carbon copying” their primary datacenters in the pursuit of failover capabilities.  But in this pursuit they have found many problems associated with running not just 1 but maybe 3 datacenters in this ever expanding digital world.  Some of these problems are:

  1. Outsourcing – most large companies will not outsource their DR site to a co-location company for fear that a competitor might have access to their data/assets.  So the costs of running a DR datacenter are nearly the same as those of running your primary site.
  2. Trusting the DR applications – As applications and operating systems are constantly updated in your Dev, QA, UAT, and Production systems, quite often these updates either are not promoted to or are purposefully kept from the DR environment.  This slowly causes the DR site to become out of sync with Production.
  3. Lack of testing DR systems – I have seen many large companies and government agencies take downtime on an application or system rather than actually pull the trigger and bring the DR site up.  They just don’t have faith in it or have not tested it extensively.
  4. Certification of systems/applications – LOL, need I say more? Can I get a second helping of certification sir?

Bottom line: DR centers are everywhere, and they are sitting idle and outdated.  The real solution to an effective DR center is measuring those systems, at any given time, against what is expected to be available and running.  DR and Production should ALWAYS be identical, at least at the application layer.  Having a DR version of a trading application, for example, that is 1 rev off, and then failing over to it, could spell disaster (which is why no one is willing to pull the trigger on a failover).  Centrally managed and deployed changes must be replicated and validated in both Production and DR to allow you to trust that what is in Production is also in DR.  No one wants to run an application that might be dirty.  The real solution is to measure those applications against a trusted source at all stages of the application and system lifecycle, from Development to DR and all stops in between.
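To make the idea of measuring DR against Production concrete, here is a rough sketch in Python. It assumes each environment can be reduced to a manifest mapping file paths to fingerprints; the function names, the choice of SHA-256, and the sample manifests are all assumptions made for illustration, not a description of any real tool.

```python
import hashlib
from pathlib import Path


def build_manifest(root: str) -> dict[str, str]:
    """Map each file under root (by relative path) to its SHA-256 fingerprint."""
    base = Path(root)
    return {
        p.relative_to(base).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(base.rglob("*"))
        if p.is_file()
    }


def compare_manifests(production: dict[str, str], dr: dict[str, str]) -> dict[str, list[str]]:
    """Report where the DR manifest has drifted from the Production manifest."""
    shared = production.keys() & dr.keys()
    return {
        "missing_in_dr": sorted(production.keys() - dr.keys()),
        "extra_in_dr": sorted(dr.keys() - production.keys()),
        "mismatched": sorted(p for p in shared if production[p] != dr[p]),
    }


def in_sync(report: dict[str, list[str]]) -> bool:
    """True only when no drift of any kind was found."""
    return not any(report.values())


# Hypothetical manifests: the DR copy of the trading binary is one rev off.
production = {"bin/trader": "aaa", "conf/app.cfg": "bbb"}
dr = {"bin/trader": "aab", "conf/app.cfg": "bbb"}

report = compare_manifests(production, dr)
print(report["mismatched"])   # ['bin/trader']
print(in_sync(report))        # False
```

In practice the two manifests would come from something like build_manifest run against each site's application directory, and a failover would only be trusted while in_sync holds.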