Cloud analytics: the future, but not today

Cloud analytics might seem an ideal marriage, but it is still at the getting-to-know-you stage. Nevertheless, analysts predict a happy future, and Pandora Radio already uses data analytics to target customers.

Cloud computing is making steady inroads into business technology. But when it comes to cloud analytics, the jury is still out.

According to a 2010 IDC survey, published as 'The Maturing Cloud: What the Grateful Dead Can Teach Us About Cloud Economics', 50% of respondents said they were likely to use the cloud for analytics or business intelligence (BI). But that leaves just as many unconvinced. And even normally bullish vendors struggle to produce reference deployments for cloud-based BI. Although on-demand or cloud business intelligence software has been on the market for several years now, even some BI vendors say it remains far from a mainstream technology.

"We have had an on-demand business intelligence offering for three years now, though, but demand for on-premise business intelligence is several magnitudes greater, even today," said Jason Rose, a senior director for BI solutions at SAP.

About 95% of SAP’s BI business involves on-premises software, Rose said. He added that customers are looking at cloud BI and, in some cases, deploying it for limited uses. "We are especially seeing it being used for prototyping, where there are not as many worries about back-end integration and where the cloud makes it easier to share the project with stakeholders. But when it comes to rolling out [live deployments], we are still seeing most customers do that on-premise."

Cloud analytics a match made in heaven?
In some ways, organisations’ reluctance to roll out analytics in the cloud might seem a surprise.

Cloud computing lends itself to analytics in several ways: analysis work often has peaks and troughs of demand, increasing the hardware costs associated with on-premises solutions; BI projects are often urgent and designed to meet an immediate, but sometimes also temporary, business need; growing data volumes are putting more pressure on processing and storage resources, which could be supplied through the cloud; and analysis reports need to be shared, again boosting the case for a hosted resource.

"There is, in fact, a tremendous amount of analysis being done in the cloud," said Bill Gassman, a research director covering BI at Gartner, a technology consulting and research firm. "But it is primarily in conjunction with other applications or analysis tools, such as Web analytics. Most of that is being done in the cloud, but we are not seeing as much take-up via Amazon [Web Services] or as Software as a Service [platforms]."

Often, vendors selling cloud BI capabilities are finding that their best route to market is to sell their services as analytics add-ons to other Web-based technologies, rather than directly to end users, Gassman said.

One reason is that, despite the potentially unlimited processing power and storage capacity of the cloud, physical and practical barriers to cloud analytics remain. Businesses are handling ever-larger volumes of data and, whilst "big data" can bring benefits in terms of more accurate decision making and analysis, moving such data to a remote computing environment adds another processing step and a potential delay in data availability.

Even when a business usually processes data in batches, such a delay could add latency to the analysis process that outweighs the benefits of using the cloud. That could be compounded in more operational data analytics applications, such as in financial services, where the need for real-time or near-real-time processing means that data is best analysed as close as possible to its source.
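The trade-off between transfer delay and processing gain can be put into rough numbers. The following is a minimal sketch, not a sizing tool: the function names, the link speed and the processing times are all hypothetical figures chosen for illustration, and none of them comes from the companies or analysts quoted here.

```python
def transfer_hours(data_gb: float, uplink_mbps: float) -> float:
    """Hours needed to upload data_gb gigabytes over an uplink_mbps link.

    Assumes decimal units (1 GB = 8,000 megabits) and a link that
    sustains its full rated speed, which real links rarely do.
    """
    megabits = data_gb * 8 * 1000
    return megabits / uplink_mbps / 3600


def cloud_is_faster(data_gb: float, uplink_mbps: float,
                    local_hours: float, cloud_hours: float) -> bool:
    """True if upload time plus cloud processing beats local processing."""
    return transfer_hours(data_gb, uplink_mbps) + cloud_hours < local_hours


# Hypothetical nightly batch: 450 GB over a 100 Mbit/s uplink.
upload = transfer_hours(450, 100)
print(f"Upload time: {upload:.1f} hours")
print("Cloud wins against a 12-hour local run:", cloud_is_faster(450, 100, 12, 1))
print("Cloud wins against an 8-hour local run:", cloud_is_faster(450, 100, 8, 1))
```

With these assumed figures the upload alone takes longer than the shorter local run, which is the situation the paragraph above describes: the cloud only pays off when the processing time saved exceeds the time spent moving the data.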

In addition, unless organisations deal only in anonymised data that contains little in the way of sensitive information or intellectual property, CIOs and BI programme managers need to consider data protection and data security issues, as well as the quality of the raw data, when evaluating possible cloud-based analytics deployments.

By the same token, though, some businesses are turning to cloud analytics because some or all of their data originates in the cloud, so analysing it there removes the extra step of loading it to a local system.

For example, Internet radio company Pandora Media is using Software as a Service reporting tools from BI vendor GoodData to analyse its own site-usage statistics and rankings information from external data suppliers. "We did have some suspicions about storing data in the cloud, but we have largely moved on from that," said Mark Brennan, senior director of IT at US-based Pandora (see case study below for more about Pandora’s cloud BI deployment).

And as companies tap into more external data sources, or use Web-based applications to manage their business processes, the case in favour of cloud analytics can only strengthen, according to some analysts.

Technology drivers of cloud analytics
"Cloud BI will be increasingly commonplace," predicted Conrad Thompson, a cloud computing consultant at London-based PA Consulting Group.

Thompson attributes the expected surge in interest to three factors. "Storage is becoming cheaper in the cloud, and the business only has to pay for what it uses," he said. "The second factor is processing power: processing also costs less and is more flexible. And thirdly, there are more [cloud-based] tools in the market that allow you to read across the data, or carry out processing. The cloud is allowing companies to do analysis that was previously not economical."

According to Thompson, industries such as pharmaceutical research are making more use of cloud computing, in part because it allows them to improve their BI capabilities. But, he suggests, businesses need to distinguish between workloads that involve large data sets and those in which quick processing is important.

"If you look at applications where the amount of data stored goes up rapidly, such as processing video or pictures, cloud is an obvious platform because it reduces the up-front investment," Thompson said. But he added that the cloud is not yet as suited to real-time analytics, which requires frequent loading of fresh data.

Another key factor is the maturity of an organisation’s underlying IT and analytics architecture. If a company has a modern BI infrastructure, the business case for moving to the cloud may be less clear, suggests Michael Heric, a technology partner at consulting firm Bain & Co.

"The economics of the cloud will improve dramatically over the next three to four years, and we are already seeing a 25-30% cost advantage for some workloads," he said.

But Heric thinks the most compelling investment cases apply to businesses with older-generation analytics tools, where a move to the cloud can be a quick and cost-effective way to upgrade to the latest technology.

Case study: Pandora Radio uses advanced data analytics to target listeners
Pandora Media takes a different approach to streaming music over the Internet than most other companies do.

Most Internet radio stations are either online streams of existing AM or FM broadcast stations or replicate their formats closely. Pandora's aim is to create customised radio “stations,” based on listeners' preferences and tastes.

To do that, its founders set up the Music Genome Project in 2000. Music tracks featuring songs dating from the Renaissance to today are analysed according to 400 criteria. When a listener tells Pandora that he likes a particular song, the company’s database uses the results of the analysis to find and play others with similar attributes.
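The idea of matching songs by analysed attributes can be illustrated with a small sketch. This is not Pandora's method, which is not described here beyond the summary above. It assumes each track is scored on a handful of numeric criteria (three instead of 400) and uses cosine similarity, a common generic technique, to rank other tracks. The track names and scores are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two attribute vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(liked, catalogue, top_n=2):
    """Return the top_n track names whose attributes best match the liked track."""
    scores = [
        (cosine_similarity(catalogue[liked], vector), title)
        for title, vector in catalogue.items()
        if title != liked
    ]
    scores.sort(reverse=True)
    return [title for _, title in scores[:top_n]]


# Hypothetical catalogue: each track scored on three made-up criteria.
catalogue = {
    "track_a": [0.9, 0.1, 0.8],
    "track_b": [0.8, 0.2, 0.9],
    "track_c": [0.1, 0.9, 0.2],
    "track_d": [0.7, 0.3, 0.6],
}

print(most_similar("track_a", catalogue))
```

In this toy data, a listener who likes track_a would be offered track_b and track_d, whose scores point in a similar direction, rather than track_c.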

But the Music Genome Project is only one way that Pandora uses advanced data analytics. Since the site was founded 11 years ago, it has built up an audience of 100 million registered users and 39 million regular listeners. Pandora transmits 1.8 billion hours of music and serves 1.5 billion adverts each year for its free, ad-supported service, although the company also has 1 million paying subscribers.

According to Mark Brennan, Pandora’s senior director of IT, it is the advertising market that puts the most pressure on the company’s analytics capabilities – and that is the area where he is increasingly looking to the cloud for BI.

"Analysis, for a media company, is a very big topic," explained Brennan. "As well as our internal data, we have outside data from [market research companies] Nielsen and comScore for rankings and Web visitor numbers, and we need to compare these to our internal data."

Although Pandora has some sophisticated data analysis systems, including Hadoop clusters, its mechanisms for retrieving data were "fairly basic," Brennan acknowledged. Business users relied on Excel spreadsheets or custom SQL queries, which had to go through the engineering team. "The Excel tables were created by our analytics staff, but the data would be stale by the next morning," said Brennan. In addition, although the Hadoop clusters handled internal data well, much of the data Pandora relies on is held elsewhere by the market research firms.

To bridge the gap, Brennan introduced cloud-based reporting tools developed by GoodData. Hosted by the vendor, the GoodData software can handle Pandora's own statistics and the external information while supporting queries and reports in a format that Brennan describes as "self service."

"We wanted to democratise it somewhat and put the power of data into the users' hands," he said. "We would like to reach the point where a non-technical user can look at the tables that are available [in the BI application], drag that into a project and build a query."

When it comes to where data should be stored and analysed – on-premise or in the cloud – Brennan said he is now more open-minded than he used to be.

"It is largely a practical consideration around 'big data,'" he said. "How big is the data before it is difficult to move out to the cloud, if it is on-premise? But then a lot of our data comes from other clouds, and certainly we have not pushed into any volume limits so far." Accessibility and ease of use, he suggests, trump technical issues such as where data is stored.


Stephen Pritchard is a journalist and broadcaster based in London. He has covered the technology and IT industries since the mid-1990s and has contributed to publications including Computer Weekly, The Independent, The Financial Times and CNBC Business.
