If the deluge of headlines and vendor marketing materials is anything to go by, big data is the next big thing.
So how much is there really to all this big data talk? Is it all about putting new labels on existing product offerings, as some would contend? Or is it the diametrically opposed view, a bunch of interesting but immature technologies that aren’t yet ready for enterprise deployment?
And what about the demand side? Are companies asking for these solutions? Do they need them at all?
The short answer is “all of the above”. Yes, there are solutions out there where the only thing that has changed is the label on the metaphorical box. Whether or not the new label is applicable depends on how one chooses to interpret big data.
At the other end of the spectrum, there is a collection of emerging solutions that indeed add new capabilities, even if many of these aren’t quite mainstream enterprise-ready.
Growing demand for big data
But it is also clear that there is demand for solutions throughout the spectrum of established and emerging big data solutions, and the potential for advanced data storage, access and analytics technologies is recognised by IT professionals. This was confirmed in a recent Freeform Dynamics research study (Figure 1).
The need for new approaches and solutions also becomes clear when we look at the rate at which data volumes are growing, whether it is data going into existing relational database management systems (RDBMSs), documents stored on corporate servers, log files created by systems, websites or devices, external data streams supporting business applications, or indeed data from a variety of social media environments.
But it is not only the increasing volume of data that is causing headaches: nearly half the organisations in our survey are not making the most of the information assets they have residing in structured repositories, and hardly any are exploiting the data held outside of structured systems to any significant degree (Figure 2).
Being able to get more business value out of existing internal data assets is increasingly becoming a key requirement, whether it is to achieve more granular levels of customer segmentation, augment existing online analytical processing (OLAP) engines, fine-tune fraud detection systems, react in real time to events, prevent system failures in IT – the list goes on and on.
With some of these information needs, the value to be derived is already reasonably clear, i.e. the company knows it has the data and what to look for, but needs to find a way, or an improved method, of storing, accessing, analysing and presenting it. In other cases, it is not clear what value might be found: “panning for gold”, “finding a needle in the haystack”, and similar terms are typically used to describe these scenarios.
Selecting the right tools for data analytics
These different starting points have a major influence on technology selection, as companies are not prepared to make speculative investments in expensive solutions just in case there is something to be found. This is where hybrid solutions come into play.
Existing proprietary – and typically costly – storage and database solutions are being supplemented by some of the more cost-effective emerging technologies, typically from the Apache Hadoop ecosystem. Preliminary exploration and analysis of large data volumes, where the "nuggets" are well hidden, can be carried out in a Hadoop environment. Once the "nuggets" have been found and extracted, a reduced and more structured data set can then be fed into an existing data warehouse or analytics system.
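The filter-then-aggregate pattern described above can be sketched as a pair of Hadoop Streaming-style map and reduce functions. This is an illustrative example only, not a reference to any specific product mentioned here: the log format, field names and threshold are all hypothetical, and the map/reduce phases are simulated locally rather than run on a cluster.

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    """Emit (customer_id, amount) pairs for high-value transactions only."""
    fields = line.strip().split("\t")
    if len(fields) < 3:
        return []  # skip malformed records
    customer_id, _timestamp, amount = fields[0], fields[1], fields[2]
    try:
        value = float(amount)
    except ValueError:
        return []  # skip records with a non-numeric amount
    # Keep only the "nuggets": transactions above a (hypothetical) threshold.
    return [(customer_id, value)] if value > 100.0 else []

def reducer(pairs):
    """Aggregate per-customer totals: a reduced, structured data set
    ready to be loaded into an existing warehouse table."""
    out = []
    for key, group in groupby(sorted(pairs, key=itemgetter(0)),
                              key=itemgetter(0)):
        out.append((key, sum(v for _, v in group)))
    return out

# Simulate the map and reduce phases locally on sample log lines.
raw_logs = [
    "c1\t2013-01-01\t250.0",
    "c2\t2013-01-01\t20.0",   # filtered out: below threshold
    "c1\t2013-01-02\t150.0",
    "bad record",             # filtered out: malformed
]
mapped = [pair for line in raw_logs for pair in mapper(line)]
summary = reducer(mapped)
print(summary)  # [('c1', 400.0)]
```

In a real deployment the same mapper and reducer would run over far larger raw data via Hadoop Streaming, with only the small aggregated output handed on to the warehouse or analytics system.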
From that perspective, it makes absolute sense for suppliers of existing storage, database, data warehousing and analytics software to provide connectors and APIs to Hadoop solutions, or even put together integrated offerings that feature both the proprietary and open source elements.
While some of this rush to embrace Hadoop is no doubt defensive, it is overall a sensible and desirable move. As already mentioned, many of the new big data technologies are not ready for mainstream enterprise usage, and companies without the IT capabilities of the trailblazers or typical early adopters will welcome the support from established suppliers.
Hidden value in current technology
That is not to say companies believe they will get that support. The signs are that the hype surrounding big data is potentially hiding the real value of what is already on offer. IT professionals remain very sceptical with regard to suppliers’ ability to deliver, and claims that the end of the traditional RDBMS is nigh are simply not taken seriously (Figure 3).
In addition, there is a belief that suppliers are not sufficiently geared up to support their customers with appropriate licensing and commercial arrangements as data-related needs evolve.
Overall, we are looking at an environment in which there have been lots of technology advances, the value and potential of which are, or are beginning to be, recognised. But there is much work to be done. On the supplier side, concerns about over-hyping and ability to execute must be addressed. On the enterprise side, business and IT need to work together to decide which data assets are most worth exploiting, and what business value the results could bring.
IT also needs to conduct a thorough assessment of what skills are available, what skills are needed, and how to bridge the gap. Otherwise, "big disappointment" will be the inevitable result.
Martha Bennett is vice-president, head of strategy, at Freeform Dynamics