Dr. Seuss, Yahoo! and Apache Hadoop data-intensive apps

Writing on IT news website GigaOM, cloud and infrastructure journalist Derrick Harris cites a source confirming that Yahoo! will this week spin off a separate company focused on the development and commercialisation of Apache Hadoop, the software framework that supports data-intensive distributed applications.

… and the name? Yahoo! HortonWorks

The news is strongly rumoured to finally break at the annual Hadoop Summit this week. Yahoo! is widely lauded as the initiator of Hadoop and its main contributor, and the company itself uses the framework extensively within its own web operations.

“Yahoo!’s HortonWorks (as in the Dr. Seuss book ‘Horton Hears a Who,’ a reference to the elephant logo that Apache Hadoop bears) will be comprised of a small team of Yahoo’s Hadoop engineers and will focus on developing a production-ready product based on the Apache Hadoop project, the set of open source tools designed for processing huge amounts of unstructured data in parallel,” writes GigaOM’s Harris.

Harris goes on to suggest that HortonWorks will bring improvements that make Hadoop better suited for running production workloads to support data-intensive apps.

The birth of HortonWorks could and should mean that other players in the Hadoop space will have to up their game.

“Yet, HortonWorks will have to ensure it advances Hadoop development across industry lines and not just in a manner optimized for Yahoo’s webscale needs if it wants to gain adoption,” writes Harris.

Interesting times ahead. Watch this space, or if you can’t do that, then just watch the cartoon, OK?