This is a guest blogpost by Suresh Sathyamurthy, senior director, emerging technologies, EMC
Data lakes have arrived, greeted by the tech world with a mix of scepticism and enthusiasm. In the sceptic corner, the data lake is under scrutiny as a "data dump," with all data consolidated in one place. In the enthusiasts' corner, data lakes are heralded as the next big thing for driving unprecedented storage efficiencies in addition to making analytics attainable and usable for every organization.
So who's right?
In a sense, they both are. Data lakes, like any other critical technology deployment, need infrastructure and resources to deliver value. That's nothing new. So a company deploying a data lake without the needed accoutrements is unlikely to realize the promised value.
However, data lakes are changing the face of analytics quickly and irrevocably, enabling organizations that struggle with "data wrangling" to see and analyze all their data in real time. This results in increased agility and more thoughtful decisions regarding customer acquisition and experience -- and ultimately, increased revenues.
Let's talk about those changes and what they mean for the world today, from IT right on down to the consumer.
Breaking data silos
· Data silos have long been the storage standard -- but these are operationally inefficient and limit the ability to cross-correlate data to drive better insights.
· Cost cutting is also a big driver here. In addition to management complexity, silos require multiple licensing, server and other fees, while the data lake can be powered by a single infrastructure in a cost-efficient way.
· As analytics become progressively faster and more sophisticated, organizations need to evolve in the same way in order to explore all possibilities. Data no longer means one thing; with the full picture of all organizational data, interpretation of analytics can open new doors in ways that weren't previously possible.
Bottom line: by breaking down data silos and embracing the data lake, companies can become more efficient, cost-effective, transparent -- and ultimately smarter and more profitable -- by delivering more personalized customer engagements.
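To make "cross-correlating" concrete, here is a minimal sketch in Python. It assumes two hypothetical record sets, `web_visits` and `purchases`, that used to live in separate silos and share a `customer_id` field; the names and figures are invented for illustration and do not describe any particular data lake product.

```python
# Hypothetical records from two formerly separate silos (illustrative data only).
web_visits = [
    {"customer_id": 1, "visits": 12},
    {"customer_id": 2, "visits": 3},
    {"customer_id": 3, "visits": 7},
]
purchases = [
    {"customer_id": 1, "total_spend": 240.0},
    {"customer_id": 3, "total_spend": 35.0},
]


def combine_by_customer(visits, purchases):
    """Join the two record sets on customer_id.

    Customers with no purchase record get a total_spend of 0.0.
    """
    spend_by_customer = {row["customer_id"]: row["total_spend"] for row in purchases}
    return [
        {
            "customer_id": row["customer_id"],
            "visits": row["visits"],
            "total_spend": spend_by_customer.get(row["customer_id"], 0.0),
        }
        for row in visits
    ]


combined = combine_by_customer(web_visits, purchases)

# Customers who browse but never buy: a question neither source can answer alone.
browsers_only = [row["customer_id"] for row in combined if row["total_spend"] == 0.0]
print(browsers_only)
```

The point is not the join itself but that it only becomes a single simple step once both sources are reachable from one place.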
Analytics (Big Data wrangling)
Here's the thing about data collection and analytics: it keeps getting faster and faster. Workloads like credit card fraud alerts and stock ticker analytics need to run within seconds of the action taking place. But real-time analytics aren't necessary 100% of the time; some data (such as monthly sales data, quarterly financial data or annual employee performance data) can be stored and analyzed only at specified intervals. Organizations need to be able to build the data lake that offers them the most flexibility for analytics.
Here's what's happening today:
· Companies are generating more data than ever before. This presents the unique problem of equipping themselves to analyze it, instead of just storing it -- and the data lake coupled with the Hadoop platform provides the automation and transparency needed to add value to the data.
· The Internet of Things is both a data-generating beast and a continuous upsell opportunity -- provided that organizations can make compelling offers in real time. Indeed, advertisers are on the bleeding edge of leveraging data lakes for consumer insights, and converting those insights into sales.
· Putting "real-time" in context: data lakes can reduce time-to-value for analytics from months or weeks down to minutes.
Bottom line: Analytics need to move at the speed of data generation to be relevant to the customer and drive results.
The rise of new business models
Data lakes aren't just an in-house tool; they're helping to spawn new business models in the form of Analytics-as-a-Service, which offers self-service analytics by providing access to the data lake.
Analytics-as-a-Service isn't for everyone -- but what are the benefits?
· The cost of analytics plummets due to outsourced infrastructure and automation. This means that companies can try things out and adjust on the fly with regard to customer acquisition and experience, without taking a big hit to the wallet.
· Service providers who store, manage and secure data as part of Analytics-as-a-Service are a helpful avenue for companies looking to outsource.
· Knowledge workers provide different value -- with the manual piece removed or significantly reduced, they can act more strategically on behalf of the business, based on analytics results.
· Analytics-as-a-Service is an effective path to early adoption, and to getting ahead of the competition in industries such as retail, utilities and sports clubs.
Bottom line: companies don't have to DIY a data lake in order to begin deriving value.
Overall, it's still early days for data lakes, but global adoption is growing. For companies still operating with data silos, perhaps it's time to test the waters of real-time analytics.