Q: Can big data challenges be tackled with data analytics techniques? What has been your experience with this at Yahoo!?
A: Lately, the industry has been seeing an explosion in data growth with the rising popularity of Web 3.0 and social media. This data cannot be overlooked. Companies need to use data analytics techniques to gain insights into consumer behavior and industry trends. Given the volume of data being generated on social media, huge amounts of data (big data) need to be analyzed.
Big data, as the name suggests, cannot be handled by a traditional RDBMS. Scalable systems and software are needed to analyze terabytes and petabytes of data. For example, Yahoo! processes petabytes of data on Hadoop systems. Many business units have loosely implemented the data warehousing star schema model so that they can explore data and answer business questions that were not known in advance. Instead of investing in proprietary, commercial solutions that may not scale, Yahoo! chose Hadoop to explore petabytes of data and gain insights into industry trends and consumer behavior.
We all know that Grid/Hadoop is not yet mature enough to replace traditional RDBMS systems for data warehousing needs. It requires a huge investment in programmers to apply data analytics techniques on it. Your focus should not shift completely to the new data analytics techniques, because they still lack features that traditional RDBMS systems have (the ones that help in analyzing data quickly and with minimal effort). Engineers have to develop custom code or user-defined functions for their analysis needs. These are the upfront costs that adopting data analytics techniques built on open source technologies may entail.
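To make the "custom code" point concrete, here is a minimal sketch in plain Python of the map and reduce steps an engineer might write by hand for a simple count. It is only an illustration, not Yahoo!'s code and not the Hadoop API: the tab-separated record layout (user, country, page) and the function names are assumptions made for this example. In an RDBMS the same result would come from a one-line `GROUP BY` query.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(records):
    """Emit a (country, 1) pair for each log record.

    Assumed (hypothetical) record layout: user_id<TAB>country<TAB>page.
    """
    for line in records:
        _user_id, country, _page = line.split("\t")
        yield country, 1


def reduce_phase(pairs):
    """Sort the pairs by key, then sum the values for each key.

    The sort stands in for the shuffle/sort step a cluster would
    perform between the map and reduce stages.
    """
    totals = {}
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        totals[key] = sum(value for _, value in group)
    return totals


def views_by_country(records):
    """Count page views per country, like SELECT country, COUNT(*) ... GROUP BY country."""
    return reduce_phase(map_phase(records))
```

Calling `views_by_country` on a list of such log lines returns a dictionary of counts keyed by country. Even this small aggregation takes a few functions of hand-written code, which is the kind of upfront programming effort described above.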
Rohit Chatter is Yahoo!'s data and business intelligence architect.