Amount of data stored has doubled in three years, says study

The amount of new information stored on media such as hard drives has doubled in the past three years, reaching five exabytes in 2002, according to a study released by the University of California.

The five exabytes of information put into storage in 2002 (an exabyte is one quintillion bytes) was equal to the contents of half a million new libraries, each containing a digitised version of the print collection of the entire US Library of Congress, according to the study by professors Peter Lyman and Hal Varian of the UC Berkeley School of Information Management and Systems.
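As a rough check on that comparison, the figures imply a size for one digitised Library of Congress print collection. The sketch below assumes decimal units (one exabyte as 10^18 bytes, one terabyte as 10^12 bytes); the variable names are illustrative and do not come from the study.

```python
# Assumption: decimal (SI) units, not binary ones.
BYTES_PER_EXABYTE = 10**18
BYTES_PER_TERABYTE = 10**12

stored_bytes = 5 * BYTES_PER_EXABYTE   # information stored in 2002
libraries = 500_000                    # "half a million new libraries"

# Size of one library implied by the two figures above.
bytes_per_library = stored_bytes // libraries
terabytes_per_library = bytes_per_library / BYTES_PER_TERABYTE

print(terabytes_per_library)  # 10.0
```

On these assumptions, each such library works out to about 10 terabytes.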

The professors estimated that between two and three exabytes of information was generated in 1999.

The study estimated that 92% of the data was stored on magnetic media, primarily hard drives.

The study does not address the quality of information and how people choose good information sources.

Significant differences exist in the "accessibility and usability and trustworthiness" of information between various sources, Lyman noted.

"We treated it all the same, simply to understand how much there was... but when you get into consumption, the discrimination over the quality of information, and how you make that decision, really becomes important," he added.

With the amount of stored information growing at a rate of about 30% a year, a "real change in our human ecology" is taking place, Lyman said. "Everything is public," he said. "Everything is on the record."
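The 30% annual growth figure can be checked against the doubling reported over three years. The following is a minimal sketch that assumes steady compound growth, which the study may not have assumed; the function names are illustrative.

```python
import math

def doubling_time(annual_rate):
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)

def implied_annual_rate(start, end, years):
    """Constant annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1 / years) - 1

# About 30% a year doubles in a little under three years.
print(round(doubling_time(0.30), 2))  # 2.64

# Growth implied by the 1999 estimate (two to three exabytes)
# rising to five exabytes in 2002.
low = implied_annual_rate(3, 5, 3)   # if 1999 was three exabytes
high = implied_annual_rate(2, 5, 3)  # if 1999 was two exabytes
print(f"{low:.1%} to {high:.1%}")
```

Under that assumption, the 1999 and 2002 estimates imply annual growth of somewhere between roughly 19% and 36%, a range that contains the 30% figure.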

One problem with all this information being stored is that it is not always accurate. As information passes through multiple hands, it can be condensed or mischaracterised. So commentaries or reports on a speech or a paper 20 years ago sometimes contain distortions, he said.

The study underscores the need for companies to manage their information smartly, said Gil Press, director of corporation information at EMC, an information storage supplier. But IT solutions are not the only answer, because humans still need to look at information with a critical eye.

"We are getting swamped, and we need better ways to organise and manage information," Press said. "Hopefully, information technology will never replace smart thinking and human analytical thinking."

The amount of stored information is not all the information that is being produced. Electronic channels - including TV, radio, the telephone and the internet - produced three and a half times as much information as was stored in 2002.

Most of that information was exchanged through voice telephone calls and not recorded or stored, Lyman said. The telephone accounts for the largest percentage of information flow - 17.3 exabytes if stored in digital form - followed by e-mail, which generates about 400,000 terabytes of new information each year.
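The flow figures are quoted in mixed units, so a quick conversion helps to compare them. This sketch assumes decimal units (one exabyte as one million terabytes) and takes the total flow to be exactly three and a half times the five exabytes stored; both are simplifications rather than figures from the study.

```python
# Assumption: decimal (SI) units, so 1 exabyte = 1,000,000 terabytes.
TERABYTES_PER_EXABYTE = 1_000_000

stored_eb = 5.0                      # information stored in 2002
total_flow_eb = 3.5 * stored_eb      # "three and a half times as much"
telephone_eb = 17.3                  # telephone calls, if stored digitally
email_eb = 400_000 / TERABYTES_PER_EXABYTE  # e-mail, converted to exabytes

telephone_share = telephone_eb / total_flow_eb

print(f"total flow: {total_flow_eb} EB")
print(f"e-mail: {email_eb} EB")
print(f"telephone share of flow: {telephone_share:.0%}")
```

On these rounded numbers, e-mail comes to about 0.4 exabytes, and the telephone accounts for nearly all of the estimated flow.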

The researchers estimated that the world wide web contains 172 terabytes of information on public pages.

The UC Berkeley researchers used various methods to estimate the amount of information generated and stored, including statistics such as hard drive and paper sales, publication statistics and a sampling of the web.

One surprise for Lyman was that while digital storage continues to grow, the use of paper to transmit information is not shrinking.

His team estimated that the number of terabytes of information put on paper each year increased by 36% from 1999 to 2001, while the amount of data stored magnetically each year increased by 80% between 1999 and 2002.

North Americans each consume 11,916 sheets of paper a year, while residents of the European Union each consume 7,280 sheets, the team estimated.

The majority of that paper information is produced by office documents and mail, not in formally published titles such as books or newspapers.

Grant Gross writes for IDG News Service
