Sentiment in Twitter

I was at the IBM corporate headquarters today listening to Rod Smith, Vice President of Emerging Technologies and IBM Fellow, talking about Twitter. He explained that Twitter now has over seven terabytes of new data being uploaded every day. Scanning that volume of data for mentions of your own company is a major undertaking.

7TB! That’s a lot of text to scan. Rod explained how IBM goes into a client and shows them how sentiment analysis can be used to make sense of all the comments. They pull the Twitter data in time-bound chunks of around ten minutes and then analyse each chunk, weighting individual words as more or less favourable to build a sense of the sentiment – good or bad – towards a particular product.
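To make the chunk-and-weight idea concrete, here is a minimal sketch in Python. It is only an illustration of the general approach Rod described, not IBM's actual method: the word weights, the function names and the sample tweets are all made up for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical word weights; a real lexicon would be far larger and tuned per product.
WEIGHTS = {"love": 2.0, "great": 1.5, "good": 1.0, "bad": -1.0, "sucks": -2.0}
WINDOW_MINUTES = 10  # chunk size of around ten minutes, as in the talk


def window_start(ts):
    """Round a timestamp down to the start of its ten-minute chunk."""
    minute = ts.minute - ts.minute % WINDOW_MINUTES
    return ts.replace(minute=minute, second=0, microsecond=0)


def score_text(text):
    """Add up the weights of every known word in one tweet."""
    words = (w.strip(".,!?'\"").lower() for w in text.split())
    return sum(WEIGHTS.get(w, 0.0) for w in words)


def sentiment_by_window(tweets):
    """Total the sentiment score per chunk for (timestamp, text) pairs."""
    totals = defaultdict(float)
    for ts, text in tweets:
        totals[window_start(ts)] += score_text(text)
    return dict(totals)


# Made-up sample data for illustration only.
sample = [
    (datetime(2012, 1, 1, 9, 3), "I love this phone"),
    (datetime(2012, 1, 1, 9, 8), "great camera, good battery"),
    (datetime(2012, 1, 1, 9, 14), "this phone sucks!"),
]
scores = sentiment_by_window(sample)
```

Here the first two tweets fall into the 09:00 chunk and the third into the 09:10 chunk, so each chunk gets its own running score that can be tracked over time.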

The analysis needs to be fairly complex though. It’s no good weighting the word ‘wow’ as highly positive when a tweet reads ‘wow – this product sucks!’, so the data needs to be sifted, merged and analysed to build an idea of what people are saying about particular brands.
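One simple way to handle the ‘wow’ problem, sketched below in Python, is to give such words no sentiment of their own and let them amplify whatever the rest of the tweet says. This is my own assumption about how it could be done, not something Rod described; the word lists, the 0.5 boost and the function names are all hypothetical.

```python
# Hypothetical lexicon: only these words carry a sentiment of their own.
WEIGHTS = {"love": 2.0, "great": 1.5, "sucks": -2.0, "awful": -1.5}
# Hypothetical intensifiers: neutral alone, they strengthen the rest of the tweet.
INTENSIFIERS = {"wow", "omg", "really"}
BOOST = 0.5  # assumed extra weight per intensifier


def tokens(text):
    """Lower-case the words and strip surrounding punctuation, including dashes."""
    return [w.strip(".,!?'\"-–").lower() for w in text.split()]


def naive_score(text):
    """Naive approach: treat every intensifier as a positive word worth +1.5."""
    return sum(WEIGHTS.get(w, 1.5 if w in INTENSIFIERS else 0.0) for w in tokens(text))


def context_score(text):
    """Score the sentiment-bearing words, then scale by the intensifiers found."""
    words = tokens(text)
    base = sum(WEIGHTS.get(w, 0.0) for w in words)
    boosts = sum(1 for w in words if w in INTENSIFIERS)
    return base * (1 + BOOST * boosts)


tweet = "wow – this product sucks!"
naive = naive_score(tweet)      # 'wow' cancels out most of 'sucks'
better = context_score(tweet)   # 'wow' makes the negative stronger
```

With the naive scoring the tweet comes out only mildly negative, because ‘wow’ offsets ‘sucks’. With the context-aware version the same tweet scores more negative than ‘this product sucks’ on its own, which is closer to how a person would read it.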

This is fine for Twitter, but applying the same kind of analysis to data within an organisation is difficult. Data is still largely stuck in silos and rarely shared across the entire organisation.
