Yahoo is using machine data indexing company Splunk’s Hunk tool to do analytics on Hadoop and NoSQL data stores.
Yahoo employees are using Hunk, which creates virtual indexes in Hadoop, to explore, analyse and visualise data from the company’s Hadoop environment, which stores more than 600PB of data.
Its teams are also analysing more than 150TB of machine data each day in Splunk Enterprise in IT operations, applications delivery, security and business analytics.
Yahoo is not a new Splunk customer. The search engine company already uses Splunk Enterprise across its IT operations, infrastructure, products and security teams.
Hunk is an analytics platform designed to enable everyone in an organisation to interactively explore, analyse and visualise big data. It was brought into Yahoo to track and improve the overall performance and stability of its grid system.
Yahoo monitoring architect Ian Flint said Splunk Enterprise and Hunk help the company gain insights into all of its data, whether it is streaming in real time or historical.
"Hunk gives Yahoo deep visibility into our Hadoop data stores to help us continuously optimise operational performance," he said. "Insights we gain from Hunk help us save millions of dollars per year in hardware provisioning."
Hunk helps Yahoo track system metrics from all of its clusters by region, visually browse complex tables, gain historical resource insights, improve data job service-level agreements, cut down development cycles, and search and troubleshoot IT issues in the grid system in real time.
Splunk product marketing vice-president Shay Mowlem described Yahoo as the birthplace of Hadoop. "It is a great honour for Hunk to play such a critical role in Yahoo’s business and its Hadoop deployment," he said.