
February 2017

Riding the elephant: how to manage big data

Sponsored by ComputerWeekly.com

Amid all the talk in recent years about what big data is and what organisations could do with it, little has been said about how to manage big data programmes and projects, even though sound management is a prerequisite for gaining business value from the analytics done on the data. Big data vendors, such as the Hadoop distributors, say they see signs that big data science projects are giving way to large-scale implementations. But how are user organisations managing those implementations? How are they designing their IT organisations, and the rest of the business, to manage and capitalise on big data?

The lead article in this e-guide focuses on the management of big data. You can also read about the progress of Apache Spark, the open source parallel processing framework, as a partial replacement for the MapReduce framework that was so important to an earlier generation of the Hadoop family of technologies; Spark also goes beyond the Hadoop Distributed File System (HDFS) to encompass relational and non-relational databases. The guide also covers how Hadoop is starting to appear in earnest in the UK public sector, and how another stripe of big data technology, NoSQL databases, is making its mark for specific use cases.

Table Of Contents

  • Big data projects need business input and careful management
  • Apache Spark grows in popularity as Hadoop-based data lakes fill up
  • Hadoop starts to trumpet way through UK public sector
  • NoSQL database technology finds use cases, but still minority sport
  • Autodata turns to big data to predict vehicle failures