Unstructured data comes from many sources, and it accounts for much of the data growth that we've seen in the enterprise. Estimates dating back to 2003 suggest that structured data represents only around 15% of all the data we use on a daily basis; everything else is considered unstructured.
Right now, your most significant sources of unstructured data are email and file services, both of which generate a lot of data. Remember, file services include more than spreadsheets and Word documents. We're also talking about video, audio and image files, rich data that is very difficult to control. With email, consider how "Reply to All" and forwarding duplicate and proliferate a single message many times over, often with attachments. Ultimately, unstructured data piles up and clutters our file servers and other infrastructure.
This was first published in March 2007