Optimising application performance with flash storage

At a Computer Weekly roundtable, IT leaders heard how Cancer Research UK achieved a 30% improvement in application performance

Responsiveness and agility are must-haves in today’s fast-paced business world, and organisations cannot afford to sit around waiting for applications to respond when there is work to be done and decisions to be made.

At a Computer Weekly roundtable debate, in association with EMC, IT leaders heard how Cancer Research UK achieved a 30% improvement in application performance in the face of rapid growth in data volumes.

Michael Briggs, head of infrastructure at Cancer Research UK, the world’s largest charity dedicated to beating cancer through research, is under pressure to ensure every pound spent on IT helps deliver more investment into beating cancer and saving lives.

Fundraising operations to channel money into scientists’ research are critical and Cancer Research UK’s marketing team needs to optimise and fine-tune campaigns to raise awareness and generate the maximum amount of donations. However, running reports against ever-increasing data volumes was proving slow and time-consuming.

The journey to cloud

Applications are running in a virtualised, private cloud environment, but Briggs said of the journey towards cloud: “Storage was not the biggest thing on our list.” He said the organisation focused on virtualisation and server consolidation.

Cancer Research UK virtualised 95% of critical applications and data with VMware – a total of more than 700 Windows and Linux virtual machines.

“We virtualised everything. It was an ambitious project and at the same time we were doing this, we consolidated offices into a single building. We realised our strategy to re-utilise equipment, but as we approached the point of hardware refresh, we were coming close to performance limits and this was soon after we moved in. We stretched the money invested as much as possible,” said Briggs.

The two legacy EMC storage systems were running at 85% utilisation. Another programme was initiated to rethink the hardware refresh and rectify performance issues, because a like-for-like replacement of storage processing power would not keep pace with data volumes growing at 30% a year.
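The effect of that growth rate can be sketched with a short calculation. This is an illustration only: it assumes the load on the storage systems scales in step with data volume, which the article does not state, and the function name is hypothetical.

```python
import math

def years_until_saturated(utilisation: float, annual_growth: float) -> float:
    """Years until utilisation reaches 100%, assuming load compounds
    at annual_growth per year (an assumption made for illustration)."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    if annual_growth <= 0:
        raise ValueError("annual_growth must be positive")
    return math.log(1 / utilisation) / math.log(1 + annual_growth)

# Figures quoted in the article: 85% utilisation, 30% annual data growth.
headroom_years = years_until_saturated(0.85, 0.30)
print(f"Headroom at current capacity: {headroom_years:.2f} years")
```

Under those assumptions, the remaining headroom is well under a year, which is consistent with the decision not to replace like for like.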

Tackling IT leaders’ application performance problems

IT leaders attending the Computer Weekly roundtable discussed their application performance issues with Cancer Research UK head of infrastructure Michael Briggs and experts from EMC. These are the key questions and answers from the debate:

Did Cancer Research UK eliminate the problem of bad code? Did you look at the application layer before addressing storage?

The CRM system was already live. There were programming enhancements made to that and a lot of time spent looking at the application element, but developers were at the time also being swamped with requests for enhancements. Prioritising business needs against increasing performance issues needed careful thought to avoid increasing costs. It is not normal to throw hardware at a problem, but XtremSF flash gave the technical people 18 months’ breathing space to solve the problems of production and meet demand from the business. Throwing hardware at a problem goes against the grain, but this time it worked very well with a solid business case to do so.

You have virtualised everything. Are XtremSF cards in multiple virtualised machines?

Initially we bought physical machines and had non-virtualised Red Hat Linux with XtremSF, but now we are altering that. We are using VMware and have also bought EMC VPlex technology - this essentially virtualises the storage layer and allows for synchronous replication between datacentres. It redefines the need for disaster recovery. If we lost the entire primary datacentre, the data is already at the secondary - think of it as high availability datacentres for the applications you target.

How did you present the business benefits to the board?

My first concern is our supporters and ensuring I am adding value with anything we do. My business case is therefore partly a mental one, but the business case is solid. It’s important we don’t shy away from spending money on technology if it enhances what we are able to do in helping cure cancer and fund scientists to do so.

As an example, the business case for XtremIO includes targeting desktop virtualisation as part of the issues resolved. With regard to monetising the amount of time wasted waiting for PCs to come on – it is difficult to put your finger on that, but if you can justify buying flash for something else or a need for storage elsewhere, then you can solve other problems. We will solve application performance problems and reduce the costs associated with the amount of storage and an additional benefit will be to cure desktop virtualisation problems too. You can target one problem area and see what else you can use the solution for. We found that database problems are not about adding more memory - in every case you chase bottlenecks. We used XtremSF to make them run more efficiently, we will use XtremIO to enhance this and cut the storage cost by reducing the amount of disk required.

How can the development environment be improved?

Developers are an interesting problem for an infrastructure department. If they were honest, each developer would like their own personal copy of an application. With 200 developers, this could be a huge storage problem, but with XtremIO we could compress multiple copies of a system. For example, copies that would previously take hours or sometimes days to produce could be made in minutes, without significantly increasing the amount of storage required.

Marketing hampered by slow access to data

However, the most serious bottleneck was for customer relationship management (CRM) applications.

“We have a typical systems mix of CRM, finance and HR etc, but CRM is central to what we do for funding scientists through campaigns which revolve around CRM. We amalgamated into a single Siebel system, but performance dropped as it grew. The marketing team needs to run complex queries against 700-800 gigabytes of data to target the right people. The more times they can run queries to analyse and check the target audience the better the campaign, but some queries were taking up to eight hours to run,” said Briggs.

Searching for IOPS bottlenecks in applications, networks and storage led to engaging EMC partner CAE to help analyse the infrastructure and propose solutions.

As a result, Cancer Research UK embarked on an IT transformation programme to speed up data access and deployed EMC VNX unified storage at its primary and secondary datacentres, which includes flash drives in the storage tiers.

An outsourced disaster recovery service was replaced with EMC Data Domain integrated with EMC NetWorker, located at both datacentres.

The business case for XtremSF Cache

“We got our first XtremSF card - just six inches by three inches in size. Although size for cost may make it look expensive, we consider every pound we spend on IT very carefully. My wife, as an example, runs Race for Life and I understand and appreciate the effort she puts in to raise funds - every pound we spend comes from somebody that puts careful thought into donating and we take that very seriously. When I buy IT equipment I need to ensure it is going to provide benefits that outweigh the costs,” said Briggs.

“When we got these cards they were pretty new to the market and we did over-think how to use them. However, as was the case with the flash tier on VNX, the correct answer was to allow the cards to work in a ‘vanilla’ fashion. On testing the sample marketing campaign which previously took eight hours to run a query, it ran in 18 seconds,” he said.

Not all campaigns run that quickly, but Briggs said the marketing team can now run more complex queries for a campaign - in some cases up to four times as many - and the more they can run, the better tuned the campaign.

“In simple terms, the technology caches the content of the database on flash on the PCIe bus of the local server and cuts out I/O read to the storage system,” said Briggs.
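The general idea of a server-side read cache can be sketched as follows. This is a minimal illustration of the caching principle Briggs describes, not a description of how XtremSF works internally; the class, the least-recently-used eviction policy, the block counts and the workload are all assumptions made for the example.

```python
from collections import OrderedDict

class ReadCache:
    """A tiny LRU read cache in front of a slower backing store."""

    def __init__(self, backend: dict, capacity: int):
        self.backend = backend        # stands in for the storage array
        self.capacity = capacity      # number of blocks the cache can hold
        self.cache = OrderedDict()
        self.backend_reads = 0        # reads that had to go to the array

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)  # mark as most recently used
            return self.cache[block_id]
        self.backend_reads += 1               # cache miss: go to the array
        data = self.backend[block_id]
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used
        return data

# Hypothetical workload: a query that re-reads the same 100 blocks 50 times.
array = {i: f"block-{i}" for i in range(1000)}
cache = ReadCache(array, capacity=200)
for _ in range(50):
    for block in range(100):
        cache.read(block)

total_reads = 50 * 100
print(f"{cache.backend_reads} of {total_reads} reads reached the array")
```

In this toy workload, only the first pass over the data touches the backing store. Every later read is served locally, which is the effect that frees up storage processing power for other applications.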

“Not only did it vastly improve CRM response times, but the business case had already sold itself as the I/O removed from the VNX storage made more storage processing power available for other applications, to such an extent that the card paid for itself four times over simply by freeing up that amount of power for less I/O hungry applications to use,” he said.

“When I can implement hardware that improves performance, improves business processes that help the charity and at the same time saves money, I have made the business case and spent charity money wisely. If that wasn’t the case, I wouldn’t be able to sleep well.”

Powerful performance

The implementation of VNX has improved performance and Briggs said there has been a 35% increase in storage processing power.

The charity now has a “flash first” strategy. It includes VNX, flash drives, and the EMC Fast Suite – consisting of Fast Cache and Fast VP with fully automated storage tiering for virtual pools. The suite automatically tiers data based on its activity level to optimise performance, reducing the need to manage storage tiering manually.
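Activity-based tiering can be illustrated with a simple ranking. The sketch below is not EMC's algorithm: the function name, the extent names and the access counts are hypothetical, and it assumes a fixed number of flash slots filled by the most frequently accessed extents.

```python
def assign_tiers(access_counts: dict, flash_slots: int) -> dict:
    """Place the most frequently accessed extents on flash, the rest on disk."""
    if flash_slots < 0:
        raise ValueError("flash_slots must not be negative")
    # Rank extents from most to least accessed.
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    hot = set(ranked[:flash_slots])
    return {extent: ("flash" if extent in hot else "disk")
            for extent in access_counts}

# Hypothetical access counts per extent over a monitoring window.
counts = {"crm-index": 9500, "crm-data": 4200, "hr-archive": 12, "finance": 800}
placement = assign_tiers(counts, flash_slots=2)
print(placement)
```

Re-running the ranking at intervals as access patterns change is what removes the manual placement effort.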

“We did not target individual processes. EMC sized the system by monitoring and looking at what we do with applications then recommending a storage solution to fit. Fast Suite enables us to work out what types of disk to put where. Before, working out where to place data and in which tier was a manual effort. Wherever we have used flash, in every case it has saved us money and increased performance, but you must target it correctly,” said Briggs.

The next step was to look at all-flash arrays and, following testing, Cancer Research UK is now looking to include XtremIO in the strategy.

“There are a couple of things available from EMC that start to feel like the Star Trek moment. When we first looked at XtremIO we looked at the available speed for IOPS - 300,000 IOPS seemed to be a target for issues we were seeing with desktop virtualisation on bulk start up,” said Briggs.

“On our test we worked the available test servers as hard as we could and started up about 300 desktops in a few minutes - but as hard as we tried we couldn’t get it above 60,000 IOPS. So we started to look at what else we could throw at it.

“At first it started to feel like a solution looking for additional problems, but we had missed something pretty fundamental and EMC pointed out that we had taken 8.5TB of desktop virtualisation and, using compression and de-duplication, got that down to around 600GB.

“We have multiple copies of our CRM database so we started to look at what it would mean if we loaded them onto this single 7.5TB X-Brick. In the end we took 40TB of database copies and they all fit. Then we hooked up the existing XtremSF servers and gained the write speed missing from that card and at the same time hardly touched the read IOPS, as this is handled by the server card. Again, 40TB of available space and processing power available for standard applications not requiring that level of performance means that the storage strategy can be amended to remove that disk, save money there, gain performance, and compress the data without losing performance. There is an obvious pattern emerging,” he said.
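The data reduction figures Briggs quotes can be checked with simple arithmetic, and the reason near-identical copies shrink so well can be shown with a toy de-duplication store. The sketch assumes decimal units (1TB = 1,000GB), and the helper names, block contents and desktop layout are hypothetical; it is not how XtremIO is implemented.

```python
import hashlib

def reduction_ratio(logical_gb: float, physical_gb: float) -> float:
    """Ratio of data written by hosts to space actually consumed."""
    return logical_gb / physical_gb

def dedup_store(blocks):
    """Store each distinct block once, keyed by its SHA-256 digest."""
    store = {}   # digest -> block contents, one entry per unique block
    refs = []    # one digest per logical block, in write order
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        refs.append(digest)
    return store, refs

# Figures from the article: 8.5TB of desktops reduced to around 600GB.
print(f"Desktop images: {reduction_ratio(8500, 600):.1f}:1")

# Hypothetical data: 300 desktops sharing the same 10 OS blocks,
# each with one unique user block.
os_blocks = [f"os-{i}".encode() for i in range(10)]
blocks = []
for desktop in range(300):
    blocks.extend(os_blocks)
    blocks.append(f"user-{desktop}".encode())

store, refs = dedup_store(blocks)
print(f"{len(refs)} logical blocks stored as {len(store)} unique blocks")
```

The same reasoning applies to multiple copies of a database: the more the copies have in common, the less additional space each one needs.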

A new approach to storage

“We have changed the way we utilise storage - now it is flash first. Flash is a new strategy and we did some of it retrospectively, but as we go through our refresh we have incorporated that way of thinking and will continue to do so as we plan ahead,” said Briggs.

Cancer Research UK’s storage strategy was originally the classic hardware approach of, “How much of this do we need?” but Briggs said this causes problems.

“Performance has nothing to do with the maximum storage you can plug into it. The question is, can you actually utilise the data stored on it effectively? All the problems in storage aren’t about the amount of storage disks – it’s about processing power. With flash, you can get the maximum amount of utilisation and solve storage problems,” he said.

Alongside these benefits, and a 30% performance increase for many applications, Briggs said flash can help with the problem of sorting out bad code, but does not replace the need to do so: “Flash can pay for itself. Buying flash is the right strategy to increase application performance and the cost will be justified if you target it correctly - you will keep the business happy.”
