LexisNexis fights crime with storage

The document search company is offering a new data forensics service staffed by former federal agents and powered by commodity NAS.

Not one to be outdone by Google, Web search competitor LexisNexis is also getting further into the software-as-a-service (SaaS) game with a new forensics service launched this week. And like Google, LexisNexis said it's relying on cheap, off-the-shelf storage to get the job done.

LexisNexis has offered e-discovery services for years, but this week the company launched a more intensive data collection service, the Data Collection and Forensic Services Lab, at its headquarters. The lab, according to a LexisNexis press release, "specialises in cases involving fraud, theft, embezzlement, employee misconduct, deception, patent infringement, industrial espionage and other white-collar crimes in which additional data processing services are needed to determine how data has been deleted, accessed or manipulated."

The forensics team at the lab consists of a handful of federal law and IT experts with backgrounds either in government or as IT pros at big accounting or telecom firms, according to Tom Williams, director of data collection and forensics services for LexisNexis and himself a former federal agent.

The lab itself has approximately 50 terabytes (TB) of commodity network-attached storage (NAS) ready to aid in investigations, and according to Williams, it's the availability of cheap, off-the-shelf disk that has really made the service possible.

"The costs of disk storage have come down so much that it has allowed us to use different techniques than we've been able to employ in the past," Williams said. The lab has NAS systems from a multitude of vendors, he said, from Apple XServs and XRAIDs to units available at Best Buy. "We have NAS suppliers on speed dial," Williams said. "We can snap in capacity very quickly, and we buy based on price."

When Williams first started out in data forensics, "buying and imaging a hard drive was so expensive that we relied on $100 40 GB tapes," he said. Gradually, he moved up to building RAID arrays using old controller cards and IDE drives, up to about 100 GB of storage.

"Now you can go down to your local supply store and buy a multiterabyte NAS box," he said. "That kind of 'snap-in-storage' has made it possible to index vast amounts of data." Among the applications, such vast indices make possible for forensics, are services like the LexisNexis lab's "rainbow tables," 9 TB of pregenerated encryption keys used to crack open encrypted or password-protected files.

"It's like having a giant file cabinet filled with keys," Williams said. "But instead of having to try every one, we can run through them quickly using a software engine, since they can all be stored in one system at once."

Cracking an encrypted or password-protected file, for instance in the case of an employee who leaves a company without restoring access to their password-protected Windows document folders, can now typically be done in about a day, according to Williams. "It used to take weeks."

The 50 TB capacity doesn't represent all of the data being processed at one time, Williams said. The company extracts data from customer sites using portable hard drives or by removing hard drives from onsite machines, and loads only the data it needs for each case onto the NAS systems for indexing and search.

"One of the most important services we offer is harvesting data from our customers' environments in a forensically sound manner, without altering metadata, like access and modified dates on files," Williams said. Those records are stored throughout an investigation on the separate hard drives in a secure, temperature-controlled vault at the Bellevue facility. After the lab is finished processing forensic data on the NAS systems, it performs Department of Defense (DoD) wipes on each disk, writing random characters over every bit on the disk, then ones, then zeroes.

"We only work from copies of the original data," Williams said of the erasure process. Maintaining the vault of disks helps cut down on power and cooling costs associated with archiving and also makes the original data less susceptible to tampering, he said.

Given the nature of his work, Williams said he's not at liberty to discuss any particularly juicy customer case studies. He did, however, volunteer the story of "the most inappropriate use of the Internet" he has ever seen, which involved a middle manager who was about to be terminated because he was downloading explicit material from the Internet onto his work computer.

"The client company, in this case, had warned him twice and sent him for counseling once," Williams said. "Now they are building a case for termination."

His lab stepped in to re-create the manager's browser history, creating a list of URLs that Williams said made him glad, once again, for cheap storage. "It was over 600 pages long," he remembered.
