Plumbing the depths of a data lake future

Data lakes filled with Citizen Source? It could happen, says Nick Booth

An email filters into my inbox heralding the Smart City Expo in Barcelona in November. (November?!! Is that the time already?)

More worrying than the passage of time are the future urban trends it predicts. In future, cities will have, among other things, elastic environments, social pairing, crowd planning and creative cultures.

Good grief, I don’t like the sound of any of that. Since when did committees make good decisions? Still, the good news is that none of these things are likely to come to fruition. Very few futurologists ever get anything right, and it doesn’t matter because no-one’s interested in history, so nobody will check what anyone said two years ago.

Analysts at the Gartner Group have been fulminating against another challenging projection. In a report, The Data Lake Fallacy: All Water and Little Substance, Gartner analysts have some stern words for the vendors who’ve been confusing everyone with their fanciful notions of ‘data lakes’. (I must admit I missed the whole data lake craze.)

The vendors who marketed data lakes have shot themselves in the foot, say the analysts, as they confused everyone so much that we were all terrified to invest anything. So big data opportunities were postponed because nobody could understand a word of what the various marketing managers were saying.

As so often happens, the message was inconsistent and incoherent, as marketing managers competed to position themselves differently on a subject they didn’t really seem to understand in the first place.

The resulting lack of agreement on what constitutes a data lake, or how to get value from it, frightened off investors and blocked potential revenue streams for resellers.

"Data lakes are marketed as enterprise-wide data management platforms for analysing disparate sources of data in its native format," says Nick Heudecker, research director at Gartner, in a bold attempt to explain this nonsense. "The idea is simple: instead of placing data in a purpose-built data store, you move it into a data lake in its original format. This eliminates the upfront costs of data ingestion, like transformation. Once data is placed into the lake, it's available for analysis by everyone in the organisation."

The mistake of the marketing hordes was to assume that all audiences are highly skilled at data manipulation and analysis, according to Gartner’s analysts.

The lack of understanding of what constitutes a ‘data lake’ is counterproductive, says Andrew White, vice president and distinguished analyst at Gartner. That confusion is caused by marketing executives trying to give a unique spin to their products, and it is frightening off investors, he says.

Data lakes do actually exist, though. And they can provide value to various parts of any organisation, insists Gartner’s White. But the proposition of enterprise-wide data management has yet to be realised, he says.

Well, that’s where we part company. I don’t believe in data lakes at all. I intend to prove they don’t exist when I attend the Smart City Expo and put it to the vote, through some ‘Citizen Sourcing’.
