Open for business: Hortonworks aims for open source profitability

It used to be the Hadoop Summit, but the strategic focus at Hortonworks the enterprise-ready open source Apache Hadoop provider, has evolved. So, this year it was renamed DataWorks Summit. The company now encompasses data at rest (the Hadoop Data Platform now in version 2.6), data in motion (the Hadoop Data Flow) and data in the cloud (the Hadoop Data Cloud). Hortonworks aims to become a multi-platform and multi-cloud company. The focus is on the data in data driven organisations. Just a few years ago Hortonworks connected with IT architects. Today it’s launching conversations with lines of business and chief marketing officers.

The company

Since the company launch in 2011 backed by Yahoo, Hortonworks has grown to over a thousand employees in 15 countries and customers in sixty countries. Its European presence is operated out of the UK with sales staff in North and Central Europe. It’s a young organisation with many newly graduated employees, strong on technology but lacking business domain insights. Many have maintained their links with universities to address big data and IoT issues. Hortonworks is involved in several joint R&D projects, in what Hortonworks co-founder Owen O’Malley terms the ‘community over code’ approach. One such project is the Digitisation of Energy, aiming to connect 1 million electrical car batteries to the grid to act as a sustainable energy reservoir.

Where’s the money coming from?

Sustained strong growth still evades Hortonworks. In response it is shifting product focus from selling converged Hadoop systems to IT departments, to selling data platforms to lines of business. Of its two main competitors, MapR remains a VC backed private company, while Cloudera is in the IPO funnel, touting its hybrid open source software (HOSS) model which ties open source elements with proprietary software for its enterprise‑grade platform. So Hortonworks may be tempted to add more proprietary elements to the open source Hadoop platforms, to increase its profitability.

Critical to maintaining an open source focus are the fast expanding fields of artificial intelligence and machine learning. Hortonworks is investing a lot of resources in developing open source code, and sees significant revenue opportunities across all business verticals. This is exemplified by its Hadoop data lake developments that encompass data analytics, mobility and IoT using Hadoop Distributed File System (HDFS) and persistent memory data structures. With increasing legal requirements for data to reside in specific geo-locations, computing must come to the data. This requires data tiering for ‘hot’, ‘warm’ and ‘cold’ data storage to optimise local computing power requirements.

Who’s helping?

Partners on the Hortonworks Data Platform include IBM, HPE, Dell EMC, Pivotal, Teradata and Microsoft. Data Flow partners have not been named yet, but several major carriers are Hortonworks customers, and may soon become partners. Especially if the Federal Communications Committee under Trump abandons its net neutrality stance and allows carriers to offer different Internet QoS (quality of service) levels. Hortonworks will help them develop differentiated services for their customers. Data Cloud partners are the two majors AWS and Microsoft Azure. Hortonworks also has domain expertise alliances with Accenture, Cap Gemini and Deloitte to roll out industry wide IoT and cyber security offerings.

Where’s the future for Hortonworks

Hybrid cloud, IoT, hyper-convergence, big data and AI all point to massive data accumulation and the need for mobile and multi-tier data processing. These are all areas where Hortonworks is active. This was exemplified by an automotive case study. Mercedes, a front-runner in the automotive market, operates with five levels of development, from yesterday’s ‘assisted driving’ to today’s ‘partially automated’ and tomorrow’s ‘conditional automation’. Then follows ‘high automation’ in 2021, and finally ‘full automation’ in 2025. Today’s top-of-the-line cars generate around 500GB of data per day. In ‘full automation’ mode, data volumes will go up to 50TB a day. That requires intelligence at the edge and real-time hand-off to cloud computing processes.

Hortonworks wants to be on that journey, not just with the automotive industry, but across many other verticals. The company believes that only open source can evolve fast enough and create the standards needed to keep up with the data frenzy.