Data leak prevention: Mistakes in database design, business processes

Even when well-enforced policies, staff training and data leak prevention (DLP) devices are in place, data leakage often still occurs because of poor business processes or database design. Michael Cobb reviews common problems that he has seen in the field.

In a previous article, I looked at some of the issues that need to be taken into account when trying to solve the problem of stopping sensitive information escaping from your organisation. However, while well-enforced policies, staff training and data leak prevention (DLP) devices all play a vital role, data leakage often still occurs because of poor business processes or database design. In this article, I want to look at a few of the problems I've come across in these areas.

SQL injection attacks
A common mistake is to allow applications to display a detailed error message when errors occur. Attackers typically test for SQL injection vulnerabilities by sending malformed input to a site's Web forms to try to generate an invalid SQL query. If the server returns an error message containing information about the structure of the application, network or database, the attacker can use those details to stage further attacks.

The core problem here is poor coding. Since coding errors are always a possibility, however, make sure your applications have a safe mode to which they can return if something truly unexpected occurs. By all means log any errors for your own records, but make sure your developers use structured exception handling, such as try {} and catch {} blocks, instead of function-based error handling. Structured exception handling is far more reliable, because an exception can be handled regardless of how many other functions are called in between, without the need to check and pass on the return code of every function call made. It is also important to remove all debug error handlers from the production code.
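As a minimal sketch of this advice, the Python example below wraps a database lookup in a structured exception handler. It is an illustration rather than a prescribed implementation: the customers table, the find_customer function and the wording of the generic message are all hypothetical, and SQLite is assumed purely so the example is self-contained. The query is also parameterised, which is an addition to the advice above and addresses the injection itself rather than the error message.

```python
import logging
import sqlite3

logger = logging.getLogger("app.errors")

# Hypothetical wording; the point is that it reveals nothing about the system.
GENERIC_ERROR = "Sorry, something went wrong. Please try again later."


def find_customer(conn, customer_id):
    """Return (status, message) without ever exposing database details."""
    try:
        # Parameterised query: user input is bound as data, never spliced into SQL.
        row = conn.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
    except sqlite3.Error:
        # Full detail, including the traceback, goes to the log for your own records.
        logger.exception("customer lookup failed")
        # The user sees only a generic message: the application's safe mode.
        return 500, GENERIC_ERROR
    if row is None:
        return 404, "Customer not found."
    return 200, row[0]
```

Whatever goes wrong inside the try block, the caller receives the same uninformative response, while the detail an attacker would find useful stays in the server-side log.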

Data inference methods
A less obvious leak occurs when sensitive information can be inferred from answers to valid queries. For example, date of birth, gender and town may provide useful information for an advertising campaign, but together they could potentially enable a salesperson to re-associate a customer with his or her purchase records (a re-identification disclosure). Even if the dataset used by the sales department has had individual customer names and email addresses removed, research shows that about half the population can be identified from three pieces of information: date of birth, gender and town.

If, in our example, the sales department was part of a pharmaceutical company and the sales being analysed were for prescription drugs, the salesperson could possibly deduce that a customer had a particular disease (a predictive disclosure), resulting in a serious breach of his or her privacy. Because of this kind of data inference problem, it is important to think carefully before including any sensitive data in an analysis. Where possible, you should take steps to anonymise the information; instead of providing date of birth, for example, you should use age groups.
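The age-group suggestion can be sketched in a few lines of Python. The field names (dob, gender, town), the ten-year band width and the minimum group size k are assumptions made for illustration; the right values depend on the dataset. The second function goes one step beyond the advice above and lists any combination of age band, gender and town shared by fewer than k records, since those customers may still be easy to single out.

```python
from collections import Counter
from datetime import date


def age_band(date_of_birth, today, width=10):
    """Replace an exact date of birth with a coarse age band such as '30-39'."""
    # Subtract one year if this year's birthday has not happened yet.
    age = today.year - date_of_birth.year - (
        (today.month, today.day) < (date_of_birth.month, date_of_birth.day)
    )
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def small_groups(records, today, k=5):
    """Return the (age band, gender, town) combinations held by fewer than k records."""
    counts = Counter(
        (age_band(r["dob"], today), r["gender"], r["town"]) for r in records
    )
    return {combo: n for combo, n in counts.items() if n < k}
```

If small_groups returns anything, the extract probably needs coarser bands, or a broader region instead of town, before it is handed to the analysts.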

Database index timing attacks
Finally, if your organisation is planning to build a new database, the architects should consider the risks from index-timing attacks and make modifications to the data model and application code if this is seen as a potential threat.

Database indexes, much like the index in a textbook, give the database quick reference points for finding the requested information. Indexes can, however, be used in a timing attack against the database. The attack uses a series of insert operations, which are typically available to all database users, to find weaknesses in the database's indexing algorithm and to extract data from indexed fields. The insert commands do not exploit flaws in any application logic or code; instead, by measuring the time the database takes to respond, an attacker can glean information about the values already stored in the index.

The primary recommendation to prevent this type of attack is to not use indexes on confidential data. The big drawback with this solution is that the database server would then have to perform a full-table scan, searching every row to find the particular one matching, say, a given bank account or national insurance number.

Complex queries across multiple tables also depend on indexes to retrieve data efficiently, so removing indexes from these tables would have a significant impact on performance, too. The resulting delays would cripple most large commercial databases.

Application firewall security tips
A better approach is to tune application firewalls to detect unusual patterns of activity and to monitor the database for insertion attempts; a stream of inserts into a table over a short period of time could indicate a potential attack. To defeat the attack itself, each column in a table that must be indexed and contains confidential data, such as a bank account number, should have a corresponding column in which to store the hash value of the confidential data. This hash value can then be used for database indexing.

An attacker will not be able to identify the confidential data from the hash value. Applications can still search for the data efficiently by performing a search on the indexed hash value column and passing the hashed value of the data as the search criteria.
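The hashed-index approach might look something like the following sketch, which assumes SQLite and a hypothetical accounts table purely for illustration. One deliberate departure from the description above: it uses a keyed hash (HMAC-SHA-256) rather than a plain hash, on the assumption that short, structured values such as account numbers could otherwise be recovered by hashing every possible candidate. INDEX_KEY is a placeholder; a real key would be held in a secrets store.

```python
import hashlib
import hmac
import sqlite3

# Assumption: in practice this key is loaded from a secrets store, not source code.
INDEX_KEY = b"replace-with-a-secret-key"


def lookup_hash(account_number):
    """Keyed SHA-256 digest of the confidential value, used only for indexing."""
    return hmac.new(
        INDEX_KEY, account_number.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def create_schema(conn):
    conn.execute(
        """CREATE TABLE accounts (
               id INTEGER PRIMARY KEY,
               account_number TEXT NOT NULL,  -- confidential, deliberately not indexed
               account_hash TEXT NOT NULL,    -- keyed hash, used for lookups
               holder TEXT NOT NULL)"""
    )
    # The index is built on the hash column only.
    conn.execute("CREATE INDEX idx_accounts_hash ON accounts (account_hash)")


def add_account(conn, account_number, holder):
    conn.execute(
        "INSERT INTO accounts (account_number, account_hash, holder) VALUES (?, ?, ?)",
        (account_number, lookup_hash(account_number), holder),
    )


def find_holder(conn, account_number):
    """Search on the indexed hash column, passing the hash of the search value."""
    row = conn.execute(
        "SELECT holder FROM accounts WHERE account_hash = ?",
        (lookup_hash(account_number),),
    ).fetchone()
    return row[0] if row else None
```

Because a hash bears no ordered relationship to the original value, this design supports exact-match lookups only; range queries on the confidential column would still require a different approach.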

As you can see, application firewalls can play an important role in data leakage prevention. They can prevent attacks, such as SQL injections, and warn of unusual patterns of activity. Vulnerability scanners, too, will certainly improve the overall robustness and security of an application by highlighting poor coding. These tools, however, do not detect all attacks, and cannot prevent information leakage through data inference. It is, therefore, down to human endeavour to identify business logic flaws or situations where legitimate data analysis may be putting data at risk.
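To make the "stream of inserts over a short period" signal concrete, here is a minimal sliding-window sketch in Python. It is not how any particular application firewall works; the class name, the threshold of 100 inserts and the ten-second window are all hypothetical values that would need tuning against normal traffic.

```python
from collections import defaultdict, deque


class InsertRateMonitor:
    """Flag a user whose inserts into one table exceed `threshold` within `window` seconds."""

    def __init__(self, threshold=100, window=10.0):
        self.threshold = threshold
        self.window = window
        # (user, table) -> timestamps of that user's recent inserts into that table
        self._events = defaultdict(deque)

    def record_insert(self, user, table, timestamp):
        """Record one insert; return True if the recent rate looks suspicious."""
        events = self._events[(user, table)]
        events.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while events and timestamp - events[0] > self.window:
            events.popleft()
        return len(events) > self.threshold
```

A True result is only a prompt for a person to investigate: a legitimate bulk load would trip the same alarm, which is one reason such tools cannot replace human judgement.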

Security is all about layering your defences. Common sense has to be one of those layers, because technology alone can never deliver the best possible protection.

About the author:
Michael Cobb, CISSP-ISSAP is the founder and managing director of Cobweb Applications Ltd., a consultancy that offers IT training and support in data security and analysis. He co-authored the book IIS Security and has written numerous technical articles for leading IT publications.
