How to create a data classification policy

Michael Cobb explains how to bring data classification awareness across an entire enterprise.

A data classification policy or data classification project is often driven by a need for better storage utilisation and improved indexing and search capabilities.

Since the Cabinet Office's Data Handling Review, however, the Information Commissioner's Office (ICO) has been given new powers to carry out spot checks and fine offenders; a breach of any of the eight data protection principles is now a criminal offence. The consequences have galvanised many organisations into looking at data classification as a means of managing and safeguarding access to their information. This tip looks at how to construct data classification policies.

Unfortunately, many data classification projects fail or disappoint because they're overly complex and too ambitious or difficult to achieve. In this article, I want to look at how to tackle some of the issues and challenges an organisation will face when introducing a data classification policy.

Bringing data classification awareness to the company
For any IT initiative to succeed, particularly a security-centric one such as data classification, it needs to be understood and adopted by management and the employees using the system. Changing how staff handle data, particularly sensitive data, will probably entail a change of culture across the organisation. This kind of change requires sponsorship from senior management, which must endorse the need to change current practices and ensure the necessary cooperation and accountability.

The safest approach to this type of project is to begin with a pilot. Introducing substantial procedural changes all at once invariably creates frustration and confusion. I would pick one domain, such as HR or R&D, and conduct an information audit, incorporating interviews with the domain's users about their business and regulatory requirements. The research will give you insight into whether the data is business or personal, and whether it is business-critical. This type of dialogue can fill in gaps in understanding between users and system designers, as well as ensure business and regulatory requirements are mapped appropriately to classification and storage requirements. Issues of quality and data duplication should also be covered during your audit.

Categorising and storing everything may seem an obvious approach, but data centres have notoriously high maintenance costs, and there are other hidden expenses: backup processes, archive retrieval and searches of unstructured and duplicated data all take longer to carry out, for example. Furthermore, classification levels that are too granular can quickly become complex and expensive to maintain.

There are several dimensions by which data can be valued, including financial or business, regulatory, legal and privacy. A useful exercise to help determine the value of data, and to which risks it is vulnerable, is to create a data flow diagram. The diagram shows how data flows through your organisation and beyond so you can see how it is created, amended, stored, accessed and used. Don't, however, just classify data based on the application that creates it, such as CRM or Accounts. This type of distinction may avoid many of the complexities of data classification, but it is too blunt an approach to achieve suitable levels of security and access.
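
One way to make the valuation dimensions above concrete is to rate each data set on every dimension and let the highest rating drive its classification. The sketch below is illustrative only: the three levels, the 0-2 rating scale and the "highest rating wins" rule are assumptions made for the example, not part of any official scheme.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical classification levels, lowest to highest sensitivity.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2


# The valuation dimensions named in the text.
DIMENSIONS = ("business", "regulatory", "legal", "privacy")


def classify(ratings):
    """Return the level implied by the highest-rated dimension.

    `ratings` maps each dimension to 0 (low), 1 (medium) or 2 (high).
    A missing dimension raises an error rather than counting as "low",
    so an incomplete audit cannot silently under-classify data.
    """
    missing = [d for d in DIMENSIONS if d not in ratings]
    if missing:
        raise ValueError(f"unrated dimensions: {missing}")
    return Level(max(ratings[d] for d in DIMENSIONS))


if __name__ == "__main__":
    # Low business value, but high privacy impact: treated as confidential.
    staff_records = {"business": 0, "regulatory": 1, "legal": 1, "privacy": 2}
    print(classify(staff_records).name)
```

Taking the maximum rather than an average reflects the point that a single high-risk dimension, such as privacy, is enough to warrant protection even when the data has little commercial value.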

One consequence of data classification is the need for a tiered storage architecture, which will provide different levels of security within each type of storage, such as primary, backup, disaster recovery and archive -- increasingly confidential and valuable data protected by increasingly robust security. The tiered architecture also reduces costs, with access to current data kept quick and efficient, and archived or compliance data moved to cheaper offline storage.
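
The tiering idea can be sketched as a simple lookup: the classification decides how strong the protection must be, while the data's lifecycle stage decides which storage tier it sits on. The level names, tier names and security labels below are hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    tier: str      # which type of storage holds the data
    security: str  # how robust the protection on that storage must be


# Hypothetical mapping: more confidential data gets more robust security.
SECURITY_BY_LEVEL = {
    "public": "basic",
    "internal": "standard",
    "confidential": "enhanced",
}


def storage_policy(level, in_active_use):
    """Choose a storage tier and security level for a data set.

    Current data stays on fast primary storage; data that is no longer
    in active use moves to the cheaper archive tier. The security level
    follows the classification, whichever tier the data lives on.
    """
    if level not in SECURITY_BY_LEVEL:
        raise ValueError(f"unknown classification level: {level!r}")
    tier = "primary" if in_active_use else "archive"
    return StoragePolicy(tier=tier, security=SECURITY_BY_LEVEL[level])


if __name__ == "__main__":
    print(storage_policy("confidential", in_active_use=False))
```

Keeping the two decisions separate means archived confidential data is still held to the same security standard as when it was in daily use.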

Create user roles, monitor data handling activities
As part of any data classification undertaking, you need to decide who can access what data as it is used. Access to protectively marked material should be based on a user's:


  • Role
  • Trustworthiness
  • Training in data handling activities

When it comes to accessing classified data, users' roles are more important than their seniority. Just because someone is "director of R&D" doesn't mean he or she should have access to payroll data, for example. No staff members should be allowed to access classified data until they have had adequate data handling and data privacy training.
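
The three access criteria above can be expressed as a single check. This is a minimal sketch rather than a real access control system: the `User` fields, the role names and the `NEED_TO_KNOW` table are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)
    vetted: bool = False   # trustworthiness checks completed
    trained: bool = False  # data handling and privacy training completed


# Hypothetical need-to-know table: data set -> roles allowed to read it.
NEED_TO_KNOW = {
    "payroll": {"hr_officer", "finance_officer"},
    "research_notes": {"rd_director", "rd_engineer"},
}


def may_access(user, dataset):
    """Grant access only if role, vetting and training all check out.

    Seniority plays no part: a user needs a role that is explicitly
    listed for the data set. Unknown data sets are denied by default.
    """
    allowed_roles = NEED_TO_KNOW.get(dataset, set())
    return bool(user.roles & allowed_roles) and user.vetted and user.trained


if __name__ == "__main__":
    director = User("R&D director", {"rd_director"}, vetted=True, trained=True)
    print(may_access(director, "research_notes"))  # True
    print(may_access(director, "payroll"))         # False
```

Note that the director is refused payroll data despite being senior, and that a correctly assigned role is still not enough until training has been completed.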

Staff must be given training in the principles behind any data classification initiative. The data classification policy must give clear guidance to all staff members -- this includes contractors and relevant third-party suppliers -- on how their role relates to the policy and the Data Protection Act. Aside from knowing which data requires a sensitive classification and which data can become sensitive when combined with other pieces of valuable data, staff must also understand the following key points:

  • Users producing information are responsible for determining the classification and applying any appropriate labels and release marking.
  • Access to protectively marked assets is only granted on the basis of the 'need to know' principle.
  • Deliberate or accidental compromise of protectively marked material may lead to disciplinary proceedings.

Classified data should not be accessible to anyone who has shown a lack of reliability through dishonesty or lack of integrity, or who may be subject to improper influence due to personal circumstances. Security concerns may require HR to carry out additional checks on staff, such as a Criminal Records Bureau (CRB) check.

To control access to your data, users will need to be identified, authenticated and then authorised. The greatest current weakness in authenticating users is the continued use of passwords, which aren't robust enough: users share them and all too often choose ones that are easy to guess or crack. Access to classified data really needs to be controlled using strong, or two-factor, authentication, whereby a second factor, such as something the user has or something the user is, is required in addition to something the user knows, like a password.
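
As an illustration of a "something the user has" factor, the sketch below checks a time-based one-time password (TOTP, as specified in RFC 6238) alongside the password result. TOTP is only one possible second factor and is chosen here as an example; the `authenticate` function and its `password_ok` parameter are hypothetical, and password verification itself is assumed to happen elsewhere.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, at=None, step=30, digits=6):
    """Compute an RFC 6238 one-time code (HMAC-SHA1) for the given time."""
    now = time.time() if at is None else at
    counter = int(now // step)  # number of time steps since the epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F     # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def authenticate(password_ok, secret, submitted_code, at=None):
    """Require both factors: a verified password and a matching code.

    `password_ok` is the outcome of a password check done elsewhere
    (an assumption of this sketch); `secret` is the key shared with
    the user's token or authenticator app.
    """
    expected = totp(secret, at)
    code_ok = hmac.compare_digest(expected.encode(), submitted_code.encode())
    return password_ok and code_ok


if __name__ == "__main__":
    shared_secret = b"12345678901234567890"  # test key from RFC 6238
    print(authenticate(True, shared_secret, totp(shared_secret)))
```

A production system would typically also tolerate a small amount of clock drift and limit repeated attempts; both are left out here to keep the example short.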

Data sharing is crucial to any modern technology strategy, and data classification can ensure both business and compliance requirements are met effectively and efficiently. Data classification should be the cornerstone of any data handling activities and information lifecycle management policies, ensuring proper and secure storage and use.

There are no set rules to data classification, but the government's Protective Marking and Asset Control system discussed in my first article is as good a starting place as any. Companies that take the time to understand the value of their data and classify it will not only help ensure compliance with the eight principles of the Data Protection Act, but will also gain the long-term benefits of more economical storage, improved search and better quality of service.

About the author:
Michael Cobb, CISSP-ISSAP, is the founder and managing director of Cobweb Applications Ltd., a consultancy that offers IT training and support in data security and analysis. He co-authored the book IIS Security and has written numerous technical articles for leading IT publications.
