Microsoft: big data analytics for everyone

Microsoft's Build 2015 software application development conference is almost within sight, and the firm's programmer portals are currently gleaming like a new START button.


In related data-centric news, the firm is this week announcing enhanced Microsoft data services for Hadoop alongside some related machine learning technologies.

This news comes just 24 hours after HP announced its Haven Predictive Analytics software was fit for operationalising large-scale machine learning.

These new services from Redmond reaffirm that “Microsoft is embracing open source” (the team is saying that a lot) and aim to make Hadoop simpler and easier to use (everybody wants to do that; Hadoop is hard).

Updates to Azure HDInsight include a public preview of HDInsight on Linux and general availability of Apache Storm for HDInsight.

What is Azure HDInsight?

HDInsight is a cloud distribution of Hadoop that has been architected to handle any amount of data, scaling from terabytes to petabytes on demand.

Microsoft says, “You can spin up any number of nodes at any time — we only charge for the compute and storage you actually use.”

What else is Microsoft announcing?

Hadoop 2.6 support in HDInsight, new virtual machine sizes, the ability to grow/shrink running HDInsight clusters, and a Hadoop connector for DocumentDB.

Microsoft also says it is simplifying machine learning for business with the general availability of Azure Machine Learning.

Why is machine learning a big deal?

As you will know, machine learning is, in some senses, a cousin of data mining: it allows programs to detect patterns in data and adjust application actions accordingly. Given the growth in web services (you could call them cloud services if you wanted to), the use of machine learning in what are increasingly real-time processing environments is on the rise.
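For readers who like to see the idea in code, here is a deliberately tiny Python sketch of "detect a pattern, then adjust the application's action". It is not Azure Machine Learning's API and not a full learning algorithm; it simply estimates what "normal" looks like from past readings (a z-score test) and picks an action. The function name `choose_action`, the action labels and the threshold of 3.0 are all assumptions made for illustration.

```python
from statistics import mean, stdev


def choose_action(history, latest, threshold=3.0):
    """Return "scale_up" if `latest` is unusually high compared with
    `history`, otherwise "hold".

    Hypothetical helper: the names and the threshold are illustrative only.
    """
    # With fewer than two past readings there is no baseline to learn from.
    if len(history) < 2:
        return "hold"

    baseline = mean(history)
    spread = stdev(history)

    # If every past reading was identical, any increase counts as unusual.
    if spread == 0:
        return "scale_up" if latest > baseline else "hold"

    # How many standard deviations above the baseline is the new reading?
    z_score = (latest - baseline) / spread
    return "scale_up" if z_score > threshold else "hold"
```

Fed a history of request latencies, for example, the function would leave things alone for a typical reading and ask for more capacity when a reading sits far above the learned baseline.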

Real-time & ‘push’ analytics

The company also announced a public preview of Azure Mobile Engagement, which is intended to offer ‘real-time user’ and ‘push’ analytics.

Why is Microsoft doing this?

Microsoft’s goal is to make big data technology simpler and more accessible to the greatest number of people possible, i.e. not just big data engineers, data scientists and software application developers, but also IT managers and everyday businesspeople.

Whether that is too much power in the wrong hands will come down to individual customers, but there is an argument to suggest that we need to be careful here.

Microsoft’s T. K. “Ranga” Rengarajan, corporate vice president, data platform, and Joseph Sirosh, corporate vice president of Machine Learning, tell us not to worry, saying that “[These new services] can help businesses dramatically improve their performance, enable governments to better serve their citizenry, or accelerate new advancements in science.”

Ed: did they just say “citizenry”? That’s a bit Fox News, isn’t it? We’ll let it go. What else did they say?

According to Ranga and Sirosh, “Storm for Azure HDInsight, generally available today, is another example of making big data simpler and more accessible. Storm is an open source stream analytics platform that can process millions of data ‘events’ in real time as they are generated by sensors and devices. Using Storm with HDInsight, customers can deploy and manage applications for real-time analytics and Internet-of-Things scenarios in a few minutes with just a few clicks.”
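To give a flavour of what processing events "as they are generated" means, here is a minimal Python sketch that keeps a rolling count of events per device over the last minute. It does not use Storm or HDInsight at all; the function name `rolling_counts`, the `(timestamp, device)` event shape and the 60-second window are assumptions for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import Counter, deque


def rolling_counts(events, window_seconds=60):
    """For each incoming (timestamp, device) event, yield the timestamp and
    the number of events per device seen within the last `window_seconds`.

    Hypothetical helper: assumes events arrive in timestamp order.
    """
    window = deque()    # events currently inside the time window
    counts = Counter()  # live per-device totals for that window

    for timestamp, device in events:
        window.append((timestamp, device))
        counts[device] += 1

        # Drop events that have aged out of the window.
        while window and window[0][0] <= timestamp - window_seconds:
            _, old_device = window.popleft()
            counts[old_device] -= 1
            if counts[old_device] == 0:
                del counts[old_device]

        yield timestamp, dict(counts)
```

Because it is a generator, the counts are updated one event at a time rather than after the whole data set has been collected, which is the basic difference between stream analytics and batch processing.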

Should big data analytics be democratised for everyone?

Well yes, but let’s take it slowly please… nobody wants this bubble to burst, not even Microsoft.