Chris Wysopal, Lucas Nelson, Dino Dai Zovi and Elfriede
Dustin explain which security methods should be used to investigate
file formats.
Fuzzing File Formats
Applications such as Web browsers, image viewers, and media players
regularly process files provided by untrusted remote users. The
formats and encoding of these files, especially those used for
compressed images, video, and audio, are quite complex and thus are
difficult to parse securely. It is therefore essential that the
applications' processing of these files be properly scrutinized and
tested.
As an example of a common file format vulnerability, consider the
following code fragment, written in a style commonly seen in code
that parses binary file formats. The file format may consist of a
file header and a number of sections, each with a section header.
Each section header contains a section size field that describes
how many bytes of data are contained within that section. If the
parsing code uses these values unchecked as a memory allocation
size or as an offset into the file, a denial-of-service or memory
trespass vulnerability is likely. The following code does not check
the section size field read from the file. It reads file data into
a heap-allocated buffer without validating the size or checking the
return value of HeapAlloc. This presents several problems (see
Listing 11-4).
Listing 11-4: A Common Binary File Format Parsing Vulnerability
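FILE_HEADER fh;
SECTION_HEADER *sh;
DWORD cbRead;
ReadFile(hFile, &fh, sizeof(FILE_HEADER), &cbRead, NULL);
/* Unchecked: the size arithmetic can wrap, and a NULL result is never tested. */
sh = HeapAlloc(GetProcessHeap(), 0, fh.dwSectionSize + sizeof(SECTION_HEADER));
/* Unchecked: writes dwSectionSize bytes into whatever HeapAlloc returned. */
ReadFile(hFile, sh, fh.dwSectionSize, &cbRead, NULL);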
Consider the case in which the value of the section size field
read from the file header is very large. If the allocation fails and
a buffer cannot be allocated, HeapAlloc returns NULL. When
the application then calls ReadFile with a nonzero size and a
NULL buffer pointer, the attempt to write through the NULL pointer
triggers an access violation. This raises an exception that the
application might catch and handle. If the application doesn't
handle it, the application crashes, indicating a possible
denial-of-service vulnerability. However, if the section size field
is set to 0 minus the size of the section header, the size
arithmetic overflows and wraps around to 0, so the HeapAlloc call
allocates a 0-byte buffer. The subsequent call to ReadFile then
attempts to write a large amount of data to the 0-length heap
block, causing a heap overflow. An attacker may exploit this
vulnerability to achieve arbitrary code execution.
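One way to close both holes is to validate the size before it is used. The following is a minimal, portable sketch rather than the book's own fix. It assumes a 32-bit size field, and SECTION_HEADER_SIZE, unchecked_alloc_size, checked_alloc_size, and the bytes_remaining parameter are hypothetical names introduced for illustration. The first function reproduces the wraparound described above; the second rejects sizes that would wrap or that exceed the data left in the file.

```c
#include <stdint.h>

/* Hypothetical header size; the real value depends on the file format. */
#define SECTION_HEADER_SIZE 16u

/* Mirrors the unchecked arithmetic in the listing: 32-bit unsigned
   addition wraps modulo 2^32 instead of failing. */
uint32_t unchecked_alloc_size(uint32_t section_size)
{
    return section_size + SECTION_HEADER_SIZE;
}

/* Returns 1 and stores the allocation size in *out when section_size is
   plausible; returns 0 when the sum would wrap or when the section
   claims more data than remains in the file. */
int checked_alloc_size(uint32_t section_size, uint32_t bytes_remaining,
                       uint32_t *out)
{
    if (section_size > UINT32_MAX - SECTION_HEADER_SIZE)
        return 0;                       /* addition would wrap */
    if (section_size > bytes_remaining)
        return 0;                       /* size exceeds the file contents */
    *out = section_size + SECTION_HEADER_SIZE;
    return 1;
}
```

Even when the size check succeeds, the caller still needs to test the allocator's return value for NULL before reading into the buffer.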
An application's file format handling should be tested against
improper and malformed files. The test methodology should generate
a series of malformed files by mutating properly formatted files,
generating random garbage files, and creating files likely to trigger
errors in the handling of boundary conditions. The application should
be tested against each file to ensure that it properly handles each
one without crashing or causing unexpected behavior. The next
section describes automated file corruption testing and some freely
available file format testing tools.
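The boundary-condition files mentioned above can be produced by overwriting a known size or offset field with values that tend to expose arithmetic errors. The following is a minimal sketch, assuming a 32-bit little-endian field. The function name, the value list, and the field offset are illustrative choices and are not taken from any particular tool.

```c
#include <stddef.h>
#include <stdint.h>

/* Values that commonly trigger integer overflow, sign, and
   off-by-one errors when used as sizes or offsets. */
static const uint32_t boundary_values[] = {
    0x00000000u, 0x00000001u, 0x7FFFFFFFu, 0x80000000u,
    0xFFFFFFF0u, 0xFFFFFFFEu, 0xFFFFFFFFu
};
#define NUM_BOUNDARY_VALUES \
    (sizeof(boundary_values) / sizeof(boundary_values[0]))

/* Overwrites the 32-bit little-endian field at `offset` in `buf` with
   boundary value number `index`. Returns 1 on success, or 0 if the index
   is out of range or the field does not fit inside the buffer. */
int set_boundary_value(unsigned char *buf, size_t len, size_t offset,
                       size_t index)
{
    uint32_t v;
    if (index >= NUM_BOUNDARY_VALUES) return 0;
    if (len < 4 || offset > len - 4) return 0;
    v = boundary_values[index];
    buf[offset]     = (unsigned char)(v & 0xFFu);
    buf[offset + 1] = (unsigned char)((v >> 8) & 0xFFu);
    buf[offset + 2] = (unsigned char)((v >> 16) & 0xFFu);
    buf[offset + 3] = (unsigned char)((v >> 24) & 0xFFu);
    return 1;
}
```

Writing each value in turn into a fresh copy of a valid file yields one test case per value for each field of interest.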
File Corruption Testing
File corruption testing is a form of input fuzzing targeted at
applications and interfaces operating on binary input files. Common
applications of file corruption testing include testing image,
font, and archive file format parsing.
Testing an application's handling of a binary file format may be
performed at several different levels. A straightforward yet
labor-intensive approach is to manually create a series of files
that have been corrupted in different ways and then open each one
in the application being tested. This approach requires little or
no programming; the file corruption can be performed manually with
a hex editor or by a small Python script. Several issues arise with
this approach, however. For example, there may be a large number of
test cases, and manually corrupting the files and testing them may
take too long, not to mention being mind-numbingly boring.
Some level of automation can speed up this process of file
creation and testing to free the application penetration tester to
do other, more interesting things.
Automated File Corruption
Binary file formats can be complicated, involving a large number of
structures with type, option, size, and offset fields that may have
intricate interdependencies. They may also contain file sections
possibly involving compression or encryption. Manually creating
test cases requires in-depth knowledge of the file format. It also
may require "borrowing" a good deal of code from the application to
be tested to properly compress or pack data into the file format.
Although this sort of code reuse may save time, it may make some
bugs difficult to find because the same incorrect assumptions made
in file parsing would also be made in file creation. Luckily, by
deliberately avoiding intimate knowledge of the file format, we can
sidestep this pitfall and create a generic binary file corruption
test harness that can uncover a good number of vulnerabilities
quickly.
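The core of such a format-agnostic harness can be very small. The sketch below is one possible shape and does not come from any specific tool: it overwrites randomly chosen bytes anywhere in an in-memory copy of a valid file, using a seeded generator so that any test case that causes a crash can be regenerated from its seed. The names next_random and corrupt_bytes are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Small xorshift generator so that a test case can be regenerated
   from its seed. The state must be nonzero. */
static uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Overwrites `count` randomly chosen bytes of buf[0..len) with random
   values, using no knowledge of the file format. Returns the number of
   overwrites performed (0 if the buffer is missing or empty). */
size_t corrupt_bytes(unsigned char *buf, size_t len, size_t count,
                     uint32_t seed)
{
    uint32_t state = seed ? seed : 1u;  /* xorshift needs a nonzero state */
    size_t i;
    if (buf == NULL || len == 0) return 0;
    for (i = 0; i < count; i++) {
        size_t pos = next_random(&state) % len;
        buf[pos] = (unsigned char)(next_random(&state) & 0xFFu);
    }
    return count;
}
```

A surrounding loop would write each corrupted buffer to disk, open it in the application under test, and log the seed alongside any crash.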
A simple tool to perform quick file format testing is Ilja van
Sprundel's Mangle [4]. Mangle overwrites random bytes in a binary file
format's header with random values, slightly biased toward large
and negative numbers. Mangle requires a template file to mangle and
the size in bytes of the file header. As an example, let's mangle a
JPG file.
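To make the idea concrete, here is a rough sketch of header mangling as described above. It is not Mangle's source code and does not reproduce its exact bias. The functions biased_byte and mangle_header, and the choice to force the high bit half the time, are assumptions made purely for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Picks a replacement byte. Half the time it forces the high bit on
   (0x80..0xFF), which makes multi-byte fields look large or negative;
   otherwise it returns any byte value. The split is an arbitrary choice. */
static unsigned char biased_byte(void)
{
    if (rand() % 2 == 0)
        return (unsigned char)(0x80 | (rand() % 0x80));
    return (unsigned char)(rand() % 256);
}

/* Overwrites `count` random bytes within the first `header_len` bytes of
   buf, leaving the rest of the file untouched. Returns 0 if the buffer is
   missing or the header length is zero or larger than the buffer, and
   1 otherwise. */
int mangle_header(unsigned char *buf, size_t len, size_t header_len,
                  size_t count)
{
    size_t i;
    if (buf == NULL || header_len == 0 || header_len > len) return 0;
    for (i = 0; i < count; i++)
        buf[(size_t)rand() % header_len] = biased_byte();
    return 1;
}
```

The caller would seed the generator with srand() and record the seed, so that a file that triggers a crash can be reproduced later.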
Want to learn how to mangle a JPG file? Or how to automate
the testing of local applications? Download the rest of
Chapter 11: Local Fault Injection (.pdf).
Note: Printed with permission from Addison-Wesley. "The Art
of Software Security Testing: Identifying Software Security Flaws"
by Chris Wysopal, Lucas Nelson, Dino Dai Zovi and Elfriede Dustin.
Copyright 2007. For more information about this title and other
similar books, please visit www.awprofessional.com.