A new native integration arrives this week, connecting the open source business analytics tool Pentaho Data Integration (PDI) with Storm and YARN.
NOTE: YARN is a resource manager for Hadoop, designed so that Hadoop no longer has to rely entirely on the MapReduce programming model (with its slow batch processing format). Storm is a popular data streaming technology that Twitter has open sourced.
You may sometimes see the term ‘Storm-on-YARN’, because within Hadoop, Storm needs YARN in order to run.
What this essentially represents is a route for developers to run big data analytics in real time.
Pentaho claims that by building on advances in platforms like HDP 2.0, Pentaho Labs is “future-proofing” big data investments.
The air conditioning argument/example
Analysing data as it is processed in real time has already started to make an impact on everyday life, by allowing companies to drive “meaningful actions” (yes sorry, marketing speak there) at the right time. Examples include triggering a relevant offer on a mobile device while shoppers are standing in a checkout line, or monitoring sensor data from HVAC (heating, ventilation & air conditioning) systems to maintain optimal building temperatures.
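To make the HVAC example a little more concrete, here is a minimal Python sketch of the idea: readings are handled one at a time as they arrive, rather than in a later batch job. It is not Pentaho or Storm code. The function name `hvac_actions`, the 21°C target, the tolerance and the window size are all assumptions made purely for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def hvac_actions(
    readings: Iterable[float],
    target: float = 21.0,     # assumed target temperature in Celsius
    tolerance: float = 1.0,   # assumed acceptable drift either side of target
    window: int = 3,          # assumed number of recent readings to average
) -> Iterator[Tuple[float, str]]:
    """Yield (rolling_average, action) for each reading as it arrives."""
    recent = deque(maxlen=window)  # only the last `window` readings are kept
    for temp in readings:
        recent.append(temp)
        avg = sum(recent) / len(recent)
        if avg > target + tolerance:
            action = "cool"
        elif avg < target - tolerance:
            action = "heat"
        else:
            action = "hold"
        yield avg, action
```

Feeding it the readings 21.0, 23.0, 25.0, 19.0 and 15.0 produces the actions hold, hold, cool, cool, heat. In a real deployment the readings would come from a live stream rather than a list, but the decision is still made per reading.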
“YARN is enabling Hadoop to be used as a flexible multi-purpose data processing and analytics platform,” said Matt Aslett, Research Director, Data Management and Analytics, 451 Research.
“We are seeing growing interest in Hadoop not just as a platform for batch-based MapReduce but also rapid data ingestion and analysis, especially using Apache Storm. Native support of Storm and YARN from companies like Pentaho will encourage users to innovate and drive greater value from Hadoop.”
“Our customers are facing fast technology iterations from the relentless evolution of the big data ecosystem. With Pentaho’s Adaptive Big Data Layer and Big Data Analytical Platform, our customers are ‘future proofed’ from the rapid pace of the big data environment,” said Richard Daley, founder and chief strategy officer, Pentaho.
Next time you have to explain big data analytics and the Internet of Things (IoT) and even the Internet of Everything (IoE), perhaps try using air conditioning.