It has been said that Hadoop is hard.
More specifically, it has been said that the Hadoop framework for distributed processing of large data sets across clusters of computers using simple programming models is tough to get to grips with because:
- Hadoop is not a database
- Hadoop is not an analytics environment
- Hadoop is not a visualisation tool
- Hadoop is not known for clusters that meet enterprise-grade security requirements
This is because Hadoop is a “foundational” technology in many senses, so its route to “business usefulness” is neither direct nor clear-cut in many cases.
IBM pays analyst firm Evans Data Corporation for what are widely regarded as worthy reports. In this role, Evans has cited IBM in a new analyst survey as an “industry leader” for making Hadoop more accessible, scalable and reliable for developers.
In an attempt to justify its stance (and, presumably, its fee) in this regard, Evans Data carried out a survey of more than 1,000 big data developers.
Impartial data or not?
The analyst house did not specify how it selected these developers or whether it specifically targeted and questioned confirmed existing users of IBM technologies.
Over 25 percent of respondents identified IBM’s Hadoop as their principal distribution.
The survey also focused on key growth areas such as machine learning and streaming analytics, where 18 percent of developers cited IBM InfoSphere Streams as their preferred application for machine learning, making it the second most popular choice in the category.
IBM also recently conducted an independently audited benchmark (1), reviewed by third-party Infosizing, of three popular SQL-on-Hadoop implementations. The results showed that IBM’s Big SQL was the only Hadoop solution tested that was able to run all 99 Hadoop-DS (2) queries.
“Our platform for Hadoop helps data-intensive applications manage and analyse petabytes of big data by providing clients with an integrated approach to analytics, helping them turn information into insight,” said Beth Smith, GM, analytic platform, IBM.
Smith says that this new report and benchmark are proof that customers can ask more complex questions of IBM when it comes to Hadoop implementation.
IBM real-world efforts?
More than 200,000 developers regularly use Big Data University, an online educational site sponsored by IBM and run by new and experienced Hadoop, big data and DB2 users who want to learn or contribute course materials.
Thousands of developers also participate in IBM big data meet-ups held around the world, which are free events offering developers an opportunity to learn about and experiment with Hadoop, SQL-on-Hadoop, and other big data technologies.
Yes, things might just be getting more accessible, thanks to some of these efforts and to those of other firms in this space.
Embeddable reporting with operations-optimised geospatial analytics, anyone?
Image credit: IBM