Table of contents:
What data should I retain and for how long?
What type of media should I use? Can I adopt a tiered approach?
How can I prove that my data has not been tampered with?
Do I need to make big investments in storage to be compliant?
That's a good question because there are so many laws and regulations affecting data. It would be easy to say "It depends," and there is a grain of truth to that answer, because there are multiple kinds of information retention requirements. They vary by vertical industry (finance, for example, has different regulations from healthcare) and also by jurisdiction, whether local, national or international.
Companies need to be aware of the regulations in force in the places where they work and where their information ends up. For example, if a company is delivering services in another country, it may be subject to that country's regulations. The same applies if a company is using services from a provider in another country and transferring information to that country. We've seen that in software development, where an organisation takes a snapshot of data to use as test data; that snapshot is potentially real customer data and so is subject to protections at both ends.
To summarise, these are business issues, not IT issues. It's therefore important that IT doesn't go it alone, and that it asks the business and the business's lawyers what information needs to be retained.
There are all kinds of conflicting requirements. To give some examples, in HR some personnel records need to be retained until the individual reaches the age of 75. Meanwhile, data protection rules such as the payment card industry standard used in financial services can stipulate "no longer than is absolutely necessary" for card-related information. Conflicts can arise where one regulation requires information to be retained for a long time and another requires it to be kept for no longer than absolutely necessary.
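One way to make such conflicts visible is to record each requirement as a minimum and/or maximum retention period and check whether the resulting window is empty. The following is a minimal sketch under stated assumptions: the `RetentionRule` type, the `combine` function, the rule names and the year figures are all hypothetical placeholders for illustration, not legal guidance. Real values have to come from the business and its lawyers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    """One retention requirement, in years (hypothetical model)."""
    name: str
    min_years: Optional[float] = None  # must keep at least this long
    max_years: Optional[float] = None  # must delete no later than this


def combine(rules):
    """Return (keep_at_least, delete_by, conflict) for rules on one record type."""
    mins = [r.min_years for r in rules if r.min_years is not None]
    maxs = [r.max_years for r in rules if r.max_years is not None]
    keep_at_least = max(mins) if mins else None  # strictest "keep" rule
    delete_by = min(maxs) if maxs else None      # strictest "delete" rule
    # A conflict exists when the data must be kept longer than it may be kept.
    conflict = (
        keep_at_least is not None
        and delete_by is not None
        and keep_at_least > delete_by
    )
    return keep_at_least, delete_by, conflict


# Placeholder figures for illustration only.
rules = [
    RetentionRule("long-term record keeping (hypothetical)", min_years=6),
    RetentionRule("delete when no longer needed (hypothetical)", max_years=2),
]
keep_at_least, delete_by, conflict = combine(rules)
print(keep_at_least, delete_by, conflict)
```

A flagged conflict is not something the code can resolve; it is a prompt to take the question back to the business.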
Organisations are enhancing storage capabilities across the board, but interestingly, do not tend to see tiering as a compliance mechanism. Rather, it's archiving from any storage tier which is seen as a valid approach – it's the archiving that gives the compliance, not the tiering per se, though of course one can archive onto another storage tier.
In terms of media, it depends on retention requirements. We can look to how many years it needs to be stored, anti-tampering requirements and discovery criteria and take things from there. This can boil down to striking the balance between the cost of media and the need for fast access to data, for example to respond to discovery requests.
It might be worth just saying a quick word about discovery. This is a term that's front of mind for many US organisations, for which litigation is a frequent concern. Elsewhere, that problem doesn't go away, but just as common is the need to find and report on information for internal purposes, for example, requests from HR.
Some of these discovery processes can be quite difficult. From our research, more than 50% of those with experience of legal discovery reported that it was "a bad experience," which shows how difficult it can be to collate all the types of information required. For internal actions the figure was 30%, which is better, but shows it's not going to be a very healthy experience for many.
Regarding media, for situations that demand speed over cost, a virtual tape library (VTL) may well be seen as the more appropriate choice. Cost here often equates to the time taken by lawyers to find the necessary information, and lawyers are never cheap. We know that some records, for example patient records, sometimes need to be quickly accessible at all times, which again precludes offline storage. In the case of drugs trials, the information on specific patients and drugs may need to be retrieved as soon as possible. Where offline access is acceptable, tape is the most cost-effective medium. And in some cases, a good halfway house can be reached using optical disk. Different regulations may specify certain media. For example, the write-once, read-many (WORM) nature of optical disk makes it appropriate when records must not be tampered with, as with financial records.
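The rules of thumb above can be sketched as a simple decision function. This is an illustration only: the function name `suggest_media`, its two parameters and the order in which the checks are applied are assumptions made for the example, and a real decision would also weigh retention period, discovery criteria and any media a regulation specifies.

```python
def suggest_media(fast_access_required: bool, write_once_required: bool) -> str:
    """Map two simplified requirements to a medium (hypothetical rule order)."""
    if write_once_required:
        # WORM optical disk suits records that must not be altered.
        return "optical (WORM)"
    if fast_access_required:
        # Speed matters more than cost, so offline storage is ruled out.
        return "VTL"
    # Offline access is acceptable, so the cheapest medium wins.
    return "tape"


print(suggest_media(fast_access_required=True, write_once_required=False))
print(suggest_media(fast_access_required=False, write_once_required=False))
print(suggest_media(fast_access_required=False, write_once_required=True))
```

Putting the write-once check first is a design choice for this sketch: it treats an anti-tampering requirement as non-negotiable and access speed as a trade-off.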
We've already mentioned WORM as a mechanism. The latest technology in optical is Ultra Density Optical, which can currently support up to 30 GB [per disk] and is reputed to last over 50 years, though how they know this with it only having been around for two years is beyond me.
While there are technologies designed to prevent tampering, it's not always possible to prevent the destruction of information. Business continuity technologies such as replication and failover can help here, but clearly they need to be taken into account as part of the compliance architecture.
While there is a place for compliance technologies by themselves, compliance should, of course, be a factor in the storage business case. We can consider this in terms of two kinds of technology that exist in storage: infrastructure and information management. From the infrastructure perspective, the questions are more about whether the platform can be compromised, who has access to what and so on. So, for example, can the storage administrator access information that's on the storage platform? Meanwhile, at the higher level, information management solutions are more about how to navigate the data: finding the information that is needed for the job, or responding to discovery requests.
We've already talked to some extent about the kinds of technologies available for managing data in a compliant way. There are overlapping but separate technologies that can support efficient discovery. There is no single class of technologies we could call compliance technologies; as a result, they are available from a wide variety of information management vendors. So, for example, we have business intelligence companies such as Autonomy with Zantaz, who are able to intelligently store, archive and search as part of compliance or as part of general business activity.
There is a big 'however' to all this. The higher up the technology stack you go, the more it is necessary to involve the business in the decision-making process. We've seen this in the data classification needed to make information lifecycle management (ILM) work. As ILM works best when information is classed according to business value, it makes sense for the business to be involved in deciding what is valuable and what isn't. Such dialogue can be used as the basis for understanding which data retention regulations must be complied with and what impact these might have on certain types of data.
As we've already discussed, these are very much business decisions and not IT decisions. The IT department exists only to help the business meet its own compliance and other data retention requirements. Which brings us back to where we started: retention is a business requirement, not an IT requirement, and it therefore requires a high level of business involvement to get it right. We know that it is the organisations that have that dialogue between IT and the business that do a better job of meeting compliance requirements as well as their requirements as a business.
Jon Collins is service director of Freeform Dynamics. He is responsible for looking after the company's portfolio of services and alliances strategy. Jon is also an industry analyst, IT consultant, network manager and software engineer. Jon has accumulated significant real-world expertise and experience in many areas of IT service delivery.
Jon has worked as an industry analyst since 1999 and is a widely published author. He has also acted as an advisor to leading vendors.
This was first published in November 2008