The Pentaho brand is now a fully signed-up, card-carrying part of Hitachi Vantara.
Making good on its promise to invest in what was a company and is now a brand and product, Hitachi Vantara used the PentahoWorld 2017 user conference to launch Pentaho 8.0.
This data integration and analytics platform software is now enhanced with support for Spark and Kafka to improve data and stream processing.
Note: Apache Spark is an open source parallel processing framework for running large-scale data analytics applications across clustered computers. Apache Kafka is a distributed publish-subscribe messaging system designed to replace traditional message brokers.
Hitachi Vantara also points to product enhancements that improve Pentaho's ability to match compute resources with business demand in real time.
According to estimates from analyst firm IDC, the global datasphere will grow to 163 zettabytes by 2025.
IDC also forecasts that more than a quarter of that data will be real-time in nature, with IoT data making up more than 95 percent of it.
If these predictions hold any water, Hitachi Vantara's acquisition of (and investment in) Pentaho would appear to be fairly validated.
“We want to help customers to prepare their businesses to address this real-time data deluge by optimising and modernising their data analytics pipelines and improving the productivity of their existing teams,” said the firm, in a press statement.
New enhancements to the Pentaho 8.0 platform include:
- Stream processing with Spark: Pentaho 8.0 now enables stream data ingestion and processing using its native engine or Spark. This adds to existing Spark integration with SQL, MLlib and Pentaho’s adaptive execution layer.
- Connect to Kafka Streams: Kafka is a very popular publish/subscribe messaging system that handles large data volumes that are common in today’s big data and IoT environments. Pentaho 8.0 now enables real-time processing with specialized steps that connect Pentaho Data Integration (PDI) to Kafka.
- Big data security with Knox: Building on its existing enterprise-level security for Cloudera and Hortonworks, Pentaho 8.0 now adds support for the Knox Gateway used for authenticating users to Hadoop services.
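For readers unfamiliar with the publish/subscribe model mentioned in the Kafka item above, the following is a minimal in-memory sketch of the idea in Python. It is not Kafka's API and not a Pentaho step; the `Broker` class and the `sensor-readings` topic name are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Broker:
    """Hypothetical in-memory broker illustrating publish/subscribe."""

    def __init__(self) -> None:
        # Each topic keeps an append-only list of messages.
        self._log: DefaultDict[str, List[str]] = defaultdict(list)
        # Each topic has zero or more subscriber callbacks.
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        """Register a callback that receives every later message on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Append the message, deliver it to all subscribers, return its offset."""
        self._log[topic].append(message)
        for handler in self._subscribers[topic]:
            handler(message)
        return len(self._log[topic]) - 1


# Two independent consumers receive the same stream of messages.
broker = Broker()
dashboard: List[str] = []
archive: List[str] = []
broker.subscribe("sensor-readings", dashboard.append)
broker.subscribe("sensor-readings", archive.append)
first_offset = broker.publish("sensor-readings", "temp=21.5")
second_offset = broker.publish("sensor-readings", "temp=21.7")
```

The point of the pattern is that the publisher does not need to know who is consuming the data, which is what lets a data integration tool attach to an existing message stream without changes to the producers.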
“On the path to digital transformation, enterprises must fully exploit all the data available to them. This requires connecting traditional data silos and integrating their operational and information technologies to build modern analytics data pipelines that can accommodate a more connected, open and fluid world of data,” said Donna Prlich, chief product officer for Pentaho software at Hitachi Vantara. “Pentaho 8.0 provides enterprise scale and faster processing in anticipation of future data challenges to better support Hitachi’s customers on their digital journeys.”
Also here we find enhancements to optimise processing resources. The firm says that every organisation has constrained data processing resources that it wants to use intelligently, guaranteeing high availability even when demand for computation resources is high.
To support this, Pentaho 8.0 provides worker nodes to scale out enterprise workloads: IT managers can now bring up additional nodes and spread simultaneous workloads across all available computation resources to match capacity with demand.
This matching provides elasticity and portability between cloud and on-premises environments resulting in faster and more efficient processing for end users.
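The idea of spreading simultaneous workloads across available nodes can be sketched as a simple least-loaded assignment. This is an illustrative assumption about how such balancing might work in general, not a description of Pentaho's worker node scheduler; the function `assign_jobs`, the job names and the cost figures are all hypothetical.

```python
import heapq
from typing import Dict, List


def assign_jobs(job_costs: Dict[str, float], node_count: int) -> Dict[int, List[str]]:
    """Greedily give each job (largest first) to the currently least-loaded node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Min-heap of (current load, node id); the least-loaded node is popped first.
    heap = [(0.0, node) for node in range(node_count)]
    heapq.heapify(heap)
    assignments: Dict[int, List[str]] = {node: [] for node in range(node_count)}
    # Placing the most expensive jobs first tends to give a more even spread.
    for job, cost in sorted(job_costs.items(), key=lambda kv: kv[1], reverse=True):
        load, node = heapq.heappop(heap)
        assignments[node].append(job)
        heapq.heappush(heap, (load + cost, node))
    return assignments


# Hypothetical workloads with relative cost estimates.
jobs = {"nightly-etl": 4.0, "sales-report": 3.0, "log-cleanup": 2.0, "audit-export": 1.0}
two_nodes = assign_jobs(jobs, 2)
four_nodes = assign_jobs(jobs, 4)
```

With two nodes the four jobs are shared so that each node carries the same total cost; bringing up four nodes gives each job a node of its own, which is the capacity-to-demand matching the article describes.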
Pentaho 8.0 also comes with several new features to help increase productivity across the data pipeline. These include granular filters for preparing data, improved repository usability and easier application auditing.
For more on this subject read Hitachi Vantara PentahoWorld 2017 major trends in data clarified.