Code complexity analysis: How to keep it simple

Michael Cobb explains why simplifying your lines of code may help reduce attacks and improve the security of your applications.

A recent conversation with a client about my SQL injection attack articles left me a little uneasy. I was pleased that the organization had undertaken a series of code reviews and tightened up its validation of user inputs, but the process sounded more like box ticking than problem solving, similar to how many people tackle compliance: "What's the minimum I can do to be in compliance? Let me tick the box and move on."

So I asked how the organization had conducted the code reviews. The answer was that it had used automated scanners, because the code was far too complicated to review manually. Given that the banking industry almost collapsed because it didn't fully understand the highly complex products it was creating, I want to look at the issue of code complexity analysis and its impact on application security.

Code complexity metrics and analysis
There is currently no definitive research on code complexity, and many analysts are still not convinced that it is related to security vulnerabilities. What is certain, though, is that the number of bugs in a program has been directly related to the number of security vulnerabilities found in it, and I believe that added complexity does nothing to help the situation.

Counting source lines of code (SLOC), or the number of lines or "statements" in the source code, is one simple metric used to measure the size of a software program. However, line count correlates poorly with functionality and complexity, since a skilled developer may be able to deliver the same functionality with far less code than a novice. Comparisons are also difficult to make across programming languages.
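As a rough illustration of why SLOC is a weak proxy for functionality, here is a minimal Python sketch. The `count_sloc` helper and the two `total` implementations are hypothetical examples of my own, and the counter only skips blank lines and whole-line `#` comments, which is a simplification of what real tools do.

```python
def count_sloc(source: str, comment_prefix: str = "#") -> int:
    """Count lines that are neither blank nor whole-line comments."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count


# Two hypothetical implementations of the same functionality.
VERBOSE = """
# novice version
def total(values):
    result = 0
    for v in values:
        result = result + v
    return result
"""

CONCISE = """
# experienced version
def total(values):
    return sum(values)
"""

if __name__ == "__main__":
    print("verbose:", count_sloc(VERBOSE))
    print("concise:", count_sloc(CONCISE))
```

Both versions do the same job, yet the line counts differ by more than a factor of two, which is why SLOC on its own says little about what a program does.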

Another metric, developed by Thomas J. McCabe, is called cyclomatic complexity, which measures the number of linearly independent paths through a program's source code. So, for example, if the code has a single IF statement containing a single condition, there would be two paths through the code: one path where the IF statement evaluates as TRUE and another where it evaluates as FALSE.
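The single-IF example can be checked with McCabe's formula M = E - N + 2P, where E is the number of edges in the control-flow graph, N the number of nodes and P the number of connected components. The sketch below is a minimal illustration; the function name and the node labels are my own assumptions, and the graphs are written out by hand rather than extracted from real code.

```python
def cyclomatic_from_graph(edges, components=1):
    """Apply McCabe's formula M = E - N + 2P to a control-flow graph.

    `edges` is a list of (from_node, to_node) pairs; nodes are inferred
    from the edges, so every node must appear in at least one edge.
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


# Hand-drawn graph for a single IF with no ELSE branch.
SINGLE_IF = [
    ("condition", "then_branch"),  # IF evaluates as TRUE
    ("condition", "exit"),         # IF evaluates as FALSE
    ("then_branch", "exit"),
]

# Two IF statements in sequence.
TWO_IFS = [
    ("cond1", "then1"), ("cond1", "cond2"), ("then1", "cond2"),
    ("cond2", "then2"), ("cond2", "exit"), ("then2", "exit"),
]

if __name__ == "__main__":
    print(cyclomatic_from_graph(SINGLE_IF))  # 3 edges - 3 nodes + 2 = 2
    print(cyclomatic_from_graph(TWO_IFS))    # 6 edges - 5 nodes + 2 = 3
```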

A number of studies have found a strong positive correlation between cyclomatic complexity and coding errors; modules that have the highest complexity and amount of code tend to contain the most errors.

It's fairly obvious to me that a module with higher complexity is going to be more difficult for developers to manage since they must understand each different pathway and the results of every possible pathway. McCabe recommended that developers should measure the complexity of the code they write and split it into smaller, less complex modules whenever the cyclomatic complexity exceeds 10. This approach has the added advantage that there are fewer execution paths to test in a particular module, so a complete test is far easier to accomplish.
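A simple way to act on this threshold is to flag any function whose complexity exceeds 10 so that it can be considered for splitting. The following Python sketch uses the standard `ast` module and approximates cyclomatic complexity as one plus the number of decision points. The set of node types counted, the helper names and the generated sample code are all assumptions of mine, and nested functions are counted towards their enclosing function, so treat the scores as estimates rather than what a dedicated tool would report.

```python
import ast

# Node types treated as decision points in this approximation.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def function_complexities(source: str) -> dict:
    """Estimate cyclomatic complexity per function as 1 + decision points."""
    results = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1
            for child in ast.walk(node):
                if isinstance(child, BRANCH_NODES):
                    score += 1
                elif isinstance(child, ast.BoolOp):
                    # "a and b and c" adds two extra decision points.
                    score += len(child.values) - 1
            results[node.name] = score
    return results


def over_threshold(source: str, threshold: int = 10) -> list:
    """Return the names of functions whose estimate exceeds the threshold."""
    scores = function_complexities(source)
    return [name for name, score in scores.items() if score > threshold]


# Hypothetical sample: one small function and one with eleven IF statements.
SIMPLE = "def simple(x):\n    if x:\n        return 1\n    return 0\n"
BUSY = (
    "def busy(x):\n"
    + "".join(f"    if x == {i}:\n        return {i}\n" for i in range(11))
    + "    return -1\n"
)

if __name__ == "__main__":
    print(function_complexities(SIMPLE + BUSY))
    print("candidates for splitting:", over_threshold(SIMPLE + BUSY))
```

A check like this could run as part of a build, so that a module drifting past the threshold prompts a review rather than going unnoticed.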

The importance of a secure development lifecycle
Complex code is certainly harder to maintain, with security flaws more likely to creep in as the software evolves. But does this mean that we should avoid writing complex programs? Not at all. Look at Microsoft and Windows Vista, which is larger and more complex than its predecessors yet has fewer vulnerabilities. Microsoft has demonstrated that with a robust secure development lifecycle (SDL) in place, where security and privacy checks are enacted throughout software development, you can write complex code without it necessarily succumbing to an increased occurrence of vulnerabilities. Without good policies and procedures behind your development, code complexity will cause you problems.

This is certainly true when many developers are involved in a project. The more developers there are, the more likely it is that miscommunication will occur at the integration layer. For example, when developer A's function calls developer B's function, who is responsible for validating the inputs? Your SDL needs to define and document such issues as interfaces, function preconditions and post-conditions, and the use of APIs to third-party components.
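One way to settle the "who validates?" question is to write the contract into the function itself. The Python sketch below is only an illustration under my own assumptions: the function names, the eight-digit account ID rule and the stand-in record are hypothetical, and a real SDL would define its own conventions. Here developer B's function documents and enforces its precondition, so developer A's caller knows it does not need to repeat the check.

```python
def lookup_account(account_id: str) -> dict:
    """Developer B's function (hypothetical).

    Precondition: account_id is exactly 8 ASCII digits. This function
    enforces it, so callers do not need to re-validate.
    Postcondition: the returned dict has an "id" key equal to account_id.
    """
    if not (len(account_id) == 8 and account_id.isascii() and account_id.isdigit()):
        raise ValueError("account_id must be exactly 8 digits")
    record = {"id": account_id, "status": "active"}  # stand-in for a real lookup
    assert record["id"] == account_id  # postcondition check
    return record


def handle_request(raw_input: str) -> str:
    """Developer A's function: relies on B's documented contract."""
    try:
        record = lookup_account(raw_input.strip())
    except ValueError:
        return "rejected"
    return record["status"]


if __name__ == "__main__":
    print(handle_request(" 12345678 "))
    print(handle_request("1 OR 1=1"))
```

The design choice matters less than the fact that it is written down: whichever side validates, the docstring and the check make the responsibility explicit at the interface.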

When designing functions and protocols, developers should provide as few run-time options as possible. This keeps the amount of code exposed to attackers to a minimum. It's no coincidence that many security flaws turn up in unused options and rarely executed code. Code paths that are commonly invoked are constantly tested by end users, while obscure functions receive little attention from either users or developers. It's also not a good idea to use complex data structures on external interfaces.

The next step: Web application firewalls, scanners, code reviews?
But SDL is not the simple answer to application security. There isn't one.

Web application firewalls can prevent injection attacks, but they won't protect against application logic flaws. Scanners and code reviews won't find every bug, but by avoiding and reducing complexity when possible, you will increase detection rates.

Yes, Web 2.0 applications are complex, but they can be built using smaller, more easily tested modules, and in my opinion McCabe's advice still holds true. Coding monolithic creations that your own developers don't fully understand and then relying on automated scans to interpret and debug them is tick-box security.

The aim of SDL is to reduce the number and severity of the vulnerabilities that make it through to the release version. Complexity does tend to increase vulnerability, so high complexity in a program or module should prompt a design review to stop it spiraling out of control. By solving problems at the code level and undergoing code complexity analysis you can dramatically improve the overall security of your applications.

Note: If you develop in Java you can use the JavaNCSS command-line utility to produce various metrics including cyclomatic complexity.

About the author:
Michael Cobb, CISSP-ISSAP is the founder and managing director of Cobweb Applications Ltd., a consultancy that offers IT training and support in data security and analysis. He co-authored the book IIS Security and has written numerous technical articles for leading IT publications.
