FT takes data warehousing to the cloud and cuts costs by 80%

How the Financial Times beat its data warehousing woes with AWS Redshift and saved costs

The Financial Times Group picked AWS Redshift for its data warehousing tasks and reduced its data processing time by up to 98%. That’s not all. The public cloud service also helped it cut costs by 80%.

“Data is extremely important for us at FT,” says John O’Donovan, CTO at the Financial Times, speaking at the AWS London Summit earlier this week. “It shapes what we do and informs us what to do next.”

Like many leading publishers, FT has to go beyond just building apps to make its content available on smartphone and tablet devices. It also has to think about syndication – its data has to be available for readers on Flipboard, Samsung Smart TV, mobile devices and so on.

That’s why data warehousing is important for FT. A data warehouse is a repository of data that an enterprise's various business systems collect. It is designed to facilitate information retrieval and analysis. The data contained in a data warehouse is often consolidated from multiple systems, making analysis across those systems quicker and easier.

Typically, a data warehouse is housed on an enterprise mainframe server. But more and more enterprises are taking it to the cloud to make it more efficient.

“We had a data warehousing system within FT. It was not rubbish but it was inflexible, it had limited features, it was slow and expensive,” O’Donovan says. “Besides, it was not in a single place.” This made data analytics and warehousing tasks difficult for the IT team.

Swifter query runtime

FT then chose to take its data warehousing task to the public cloud and selected Amazon Redshift.

“We were one of the first to use Redshift,” O’Donovan says. Amazon announced Redshift only in November 2012 at its first re:Invent conference and made it available from 2013 onwards.

“Traditional data warehouse products are too expensive and have licensing complications,” said AWS senior vice-president Andy Jassy at its launch. “Many large enterprises told us they are unhappy with the existing data warehousing services in the market.”

AWS Redshift is an automated, petabyte-scale data warehouse service in the cloud that helps enterprise IT automate labour-intensive tasks such as setting up, operating and scaling a data warehouse cluster. It also aims to help them to provision capacity, monitor and back up the cluster, as well as apply patches and upgrades.

FT’s IT team decided to build its data warehousing on AWS. In doing so, it cut costs by 80% and queries ran up to 98% faster.

“In comparison, a query on Redshift ran for just 6 seconds while our old data warehousing service took 29 seconds to answer a query,” O’Donovan demonstrated at the AWS Summit.

“Our cloud-based data warehousing is very cost-effective but there were lots of other business benefits too.”
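To put the two quoted query times in context, the short Python sketch below works out the reduction they imply. The function names are illustrative and not part of any FT or AWS tooling. The 29-second and 6-second figures come from the demonstration quoted above. Reading the "up to 98%" headline figure as a best case across many queries is an assumption, not something the article states.

```python
def percent_reduction(old_seconds: float, new_seconds: float) -> float:
    """Return how much shorter the new runtime is, as a percentage of the old one."""
    if old_seconds <= 0:
        raise ValueError("old_seconds must be positive")
    return (old_seconds - new_seconds) / old_seconds * 100.0


def speedup_factor(old_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new runtime is than the old one."""
    if new_seconds <= 0:
        raise ValueError("new_seconds must be positive")
    return old_seconds / new_seconds


# Figures quoted in the demonstration: 29 s on the old system, 6 s on Redshift.
OLD_RUNTIME_S = 29.0
NEW_RUNTIME_S = 6.0

reduction = percent_reduction(OLD_RUNTIME_S, NEW_RUNTIME_S)
speedup = speedup_factor(OLD_RUNTIME_S, NEW_RUNTIME_S)
print(f"Runtime reduced by {reduction:.1f}% ({speedup:.1f}x faster)")

# Assumption: a 98% reduction is treated here as a best case. Applied to a
# 29-second query it would leave roughly this runtime:
best_case_runtime = OLD_RUNTIME_S * (1 - 0.98)
print(f"A 98% reduction on a 29 s query would leave about {best_case_runtime:.2f} s")
```

On these numbers the demonstrated query ran in roughly 79% less time, so the 98% figure presumably describes a different, more favourable query.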

Efficiency and flexibility

For one, with quick data analytics and processing, it is easy for the business divisions in FT to find people with the right skills for the right tasks. “This helps us work more efficiently,” O’Donovan says.

Secondly, the IT team does not have to worry about capital expenditure while scaling up as the cloud offers it flexibility.

“We have access to real-time data at any time and so we do not have to spend time looking at logs or reports and can make decisions far more quickly than previously possible,” O’Donovan says.

Besides, IT is no longer a black box performing mysterious tasks, he added.

The cloud is also enabling IT at the Financial Times to be innovative and launch new services quickly. At the beginning of this year, FT launched the FT Samsung Smart TV app, which allows its subscribers to watch FT videos for free on any Samsung Smart TV. From a revenue perspective, the app gives new opportunities for advertisers to reach the FT’s influential audience on television.

According to Amazon, Redshift has no upfront costs and users can scale up to a petabyte or more for $1,000 per terabyte per year, which is less than a tenth of most other data warehousing solutions.
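As a rough illustration of what that pricing claim implies, the Python sketch below scales the quoted rate up to a petabyte. It assumes a decimal petabyte (1,000 terabytes) and a flat per-terabyte rate with no other charges. Both are simplifying assumptions, and the function and constant names are hypothetical.

```python
# Rate quoted by Amazon in the article: USD per terabyte per year.
REDSHIFT_USD_PER_TB_YEAR = 1_000

# Assumption: a decimal petabyte, i.e. 1,000 terabytes.
TB_PER_PB = 1_000


def annual_cost_usd(terabytes: float, usd_per_tb_year: float = REDSHIFT_USD_PER_TB_YEAR) -> float:
    """Return the yearly cost for a given volume at a flat per-terabyte rate."""
    if terabytes < 0:
        raise ValueError("terabytes must be non-negative")
    return terabytes * usd_per_tb_year


petabyte_cost = annual_cost_usd(TB_PER_PB)
print(f"One petabyte for a year: ${petabyte_cost:,.0f}")

# "Less than a tenth" of other solutions implies those cost more than
# ten times the quoted rate per terabyte per year.
implied_competitor_floor = REDSHIFT_USD_PER_TB_YEAR * 10
print(f"Implied cost of other solutions: more than ${implied_competitor_floor:,} per TB per year")
```

Under these assumptions a full petabyte would cost about $1m a year, against an implied figure of more than $10m for the alternatives Amazon is comparing itself with.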
