
A change is as good as its test

We achieve excellent availability of our systems, and whenever we do have an outage the first question is 'what's changed?' Most of the time, we then find the outage was caused by one of two things. Either our ITIL-based change management process wasn't followed or, more subtly, it was followed but the final tests that were defined and applied didn't pick up the issue that caused the outage. So how do we define better final tests of changes to production systems? This is by no means exhaustive, but here's a rough checklist for defining tests to ensure changes have worked:
1 - chances are the change itself will work - it's what else it breaks in the process that you need to worry about.
2 - don't second-guess the people who use the system - work with them to define and agree the tests.
3 - be very clear about the timing of the tests relative to when the change is made. Don't make the change in the evening and then test at 7.45 the next morning if your call centre opens at 8.00am...
4 - try to use separate people to define the test, apply the change and test it. Otherwise there's a slight conflict of interest. End users are a useful source of testers.
5 - assume the change will fail the test, and have a plan for extricating yourself from the fine mess you have now created.
6 - make sure you've got people lined up to fix the problem or apply the escape plan. There's nothing worse than knowing it's gone wrong but not being able to get hold of the best person to fix it.
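
For anyone who keeps change records in a script or a tool, the checklist above could be turned into a simple pre-flight check. What follows is only a rough sketch in Python, not part of our actual process: the ChangePlan fields, the review_plan function and the one-hour buffer before the service opens are all assumptions made up for illustration, as are the names and dates in the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class ChangePlan:
    # Every field name here is hypothetical, invented for this sketch.
    definer: str                   # who defined the tests
    implementer: str               # who applies the change
    tester: str                    # who runs the tests
    tests_agreed_with_users: bool  # point 2: agreed with the people who use the system
    side_effect_tests: List[str]   # point 1: tests of what else the change might break
    test_time: datetime            # point 3: when the final tests are run
    service_opens: datetime        # point 3: when users next need the system
    rollback_plan: str             # point 5: how to back the change out
    on_call: List[str] = field(default_factory=list)  # point 6: who can fix or roll back


def review_plan(plan: ChangePlan,
                min_buffer: timedelta = timedelta(hours=1)) -> List[str]:
    """Return a list of checklist problems; an empty list means none were found."""
    problems = []
    if not plan.side_effect_tests:
        problems.append("no tests for what else the change might break")
    if not plan.tests_agreed_with_users:
        problems.append("tests not agreed with the people who use the system")
    # The one-hour default buffer is an assumed figure, not a recommendation.
    if plan.service_opens - plan.test_time < min_buffer:
        problems.append("tests run too close to when the service opens")
    if len({plan.definer, plan.implementer, plan.tester}) < 3:
        problems.append("test definer, implementer and tester are not separate people")
    if not plan.rollback_plan.strip():
        problems.append("no escape plan if the change fails the test")
    if not plan.on_call:
        problems.append("nobody lined up to fix the problem or apply the escape plan")
    return problems


# Illustrative example only: tested at 7.45 with the call centre opening at 8.00,
# and the same person applying and testing the change.
plan = ChangePlan(
    definer="alice",
    implementer="bob",
    tester="bob",
    tests_agreed_with_users=True,
    side_effect_tests=["call centre agents can still log in"],
    test_time=datetime(2007, 8, 9, 7, 45),
    service_opens=datetime(2007, 8, 9, 8, 0),
    rollback_plan="restore the previous release",
    on_call=["bob"],
)
for problem in review_plan(plan):
    print(problem)
```

A check like this only catches the gaps that can be written down in advance. It doesn't replace sitting down with the people who use the system and agreeing what 'working' actually means.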

About

This page contains a single entry from the blog posted on August 8, 2007 5:50 PM.
