IT professionals lack the skills to truly understand the big data opportunity for business and government, say...
Speaking at a roundtable debate organised by Autonomy, Sue Feldman, vice president at analyst IDC, said, “Big data is a new set of highly integrated technologies that can use commodity hardware for scalability. While it is at an early stage there are very few Hadoop experts, the tools are very poor, there are not enough experts in text analysis and data integration is difficult.”
IDC defines big data as databases greater than 100TB, real-time data analysis on a scalable computing architecture, or data in multiple formats such as audio, video and free text.
Analysis can require extracting information from multimedia and applying some form of contextual analysis.
Feldman said, “You need to extract the bits [from multimedia files] to generate text that can be used for text analytics. The analysis will involve entity extractions, sentiment analysis and determining relationships between different entities. Many of these approaches to analytics are beyond the skill set normally associated with database and data analytics experts.”
Furthermore, big data uses a probabilistic approach to data analysis: statistical analysis of large data sets gives the likelihood that a piece of data matches a given criterion, as in pattern matching. This is very different to the relational database model, where a query result returns either a “yes” or a “no”.
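
The contrast between a yes/no query and a likelihood-style match can be sketched in a few lines of Python. This is an illustration only, not a method described by IDC or Feldman: the sample records and the function names `exact_match` and `scored_match` are hypothetical, and the standard library's `difflib.SequenceMatcher` similarity ratio stands in for a statistical likelihood, which a real system would estimate from large data sets.

```python
from difflib import SequenceMatcher

# Hypothetical sample records, invented for this illustration.
records = ["Autonomy Corporation", "Autonomy Corp.", "IDC", "Hadoop"]


def exact_match(records, query):
    """Relational-style predicate: a record either equals the query or it does not."""
    return [r for r in records if r == query]


def scored_match(records, query):
    """Give every record a similarity score between 0.0 and 1.0, best first.

    The score is a string-similarity ratio, used here as a simple
    stand-in for a likelihood; it is not a true probability.
    """
    scored = [
        (r, SequenceMatcher(None, r.lower(), query.lower()).ratio())
        for r in records
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


query = "Autonomy Corp"

# The yes/no query finds nothing, because no record is identical to the query.
print(exact_match(records, query))

# The scored query ranks every record by how closely it resembles the query.
for record, score in scored_match(records, query):
    print(f"{record}: {score:.2f}")
```

The exact query returns an empty list, while the scored query still ranks the near-identical record first, leaving the analyst to decide what score counts as a match.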