When can you throw away data?

Billy MacInnes finds that deciding whether you can throw data away is something of a headache in an era of big data, when nobody is quite sure what might turn out to be useful.

Earlier this week I was at a conference where one of the speakers, Colin Mahony, senior vice president and general manager for HP Vertica, admitted that “information is self-perpetuating” but argued that “throwing away data is like throwing money away to me”.

His colleague Andrew Joiner, general manager for emerging technology and marketing at Autonomy (in what was essentially a double act on the subject of Information Optimisation), stated that while businesses sometimes didn’t know what they were looking for, they wanted to collect as much data as they could to make sure they didn’t miss anything. Admittedly, many didn’t have the technologies to take advantage of the data flowing through their operations, but Joiner warned: “You can’t just throw away information simply because it’s too expensive to store it.”

To which my response would be: “Yes you can, if you have an idea of whether it is of any value or not.”

The problem is that very few businesses have any understanding of what data is valuable and what is rubbish. Worse still, the industry actively encourages them to store it all anyway, just in case it does turn out to be valuable.

Now, I can’t be too critical because I admit to employing a similar strategy when it comes to putting stuff in my garage. Quite often, I end up keeping things in the garage on the off-chance that they might be useful at some point in the future. The end result of this strategy is that there comes a point, usually every 18 months to two years, when there is so much junk in the garage that we have to get a skip and I unceremoniously dump most of the stuff I thought might be worth keeping.

The beauty of the IT industry, as far as I can see, is that customers can’t bring a skip in. Why not? Because most of them haven’t got a clue what data is so useless that they can throw it away. As a result, they’re often too scared to do anything about it. Instead, they opt for the alternative of building a bigger garage to put the stuff in. In fact, to stick with the garage analogy, many of them have probably built so many garages by now that they take up more space than the original house.

Now they’re having to build walkways between the garages and put heat and lighting in them all. Guess who builds these ‘garages’? The IT industry.

That’s a pretty fantastic business to be in as far as I can see. From the customer’s perspective, though, I wonder if it might be worth pushing vendors to try to deliver systems that can decide ‘on the fly’ what data is worthwhile and what data is of no value, so they can dump the stuff they don’t need. It would be a lot cheaper, more efficient and greener.

Mahony cited a fairly impressive figure: analytics pays back $10.66 for every dollar invested. I can’t help feeling, however, that if our approach to data creation and storage were more efficient in the first place, there would be less need for continuous investment in storage technologies and in analytics.
