Storage performance testing vital, says ex-Morrisons IT chief

Former Morrisons head of technology says testing and monitoring storage performance is key to avoiding outages and getting the benefits from infrastructure upgrades

Organisations need to regularly monitor storage performance or risk catastrophic failures. Testing is also vital during times of infrastructure change to ensure upgrades meet performance improvement targets.

Those are the views of Simon Close, an IT consultant who was head of technology at supermarket group Morrisons until February 2017.

Close led a £10m storage transformation programme at Morrisons in which a multi-supplier storage environment comprising different Dell, EMC and HP arrays was consolidated onto HP 3PAR arrays. At the same time, three existing backup systems were rationalised to Commvault.

“Morrisons’ storage environment had grown organically and was complex and difficult to manage. That meant significant costs and high TCO [total cost of ownership] and maintenance costs,” said Close.

During the migration, Close’s team benchmarked performance before and after hardware changes using Virtual Instruments’ monitoring and testing product.

Virtual Instruments made its name with its Fibre Channel SAN testing probes. The company’s products use hardware and software monitoring to build a picture of server and storage operational efficiency. In 2016, the company merged with storage traffic load-testing provider Load Dynamix and used intellectual property from that merger to add NAS monitoring capability to existing SAN functionality.

“We wanted to de-risk the whole migration by benchmarking the performance of key applications – about 600 in total – before moving them off their existing storage hardware. We wanted to demonstrate at least the same or better performance post-migration and achieved improvements between 2x and 30x,” said Close.

He recommends measuring storage metrics such as read and write response times, throughput and input/output operations per second (IOPS) before and after significant hardware and software changes, taking into account likely peak processing periods, which at a supermarket include Christmas, Easter and financial year-end.

“You’ve got to measure the current performance of your environment [before a migration] or you’re not setting yourself up for success. Metrics before and after help you make sure you’re achieving successful outcomes and are evidence for third parties that need to see it,” he said.
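
As a rough illustration of the before-and-after comparison Close describes, the Python sketch below turns two sets of measurements into an improvement factor per metric and lists any metric that got worse. The class name, function names and figures are hypothetical and are not taken from Morrisons or from Virtual Instruments’ product. The sketch assumes the same workload was measured over comparable periods, ideally including a peak.

```python
from dataclasses import dataclass


@dataclass
class StorageSample:
    """One benchmark result for an application (hypothetical structure)."""
    read_ms: float          # average read response time, milliseconds
    write_ms: float         # average write response time, milliseconds
    throughput_mbps: float  # throughput, megabytes per second
    iops: float             # input/output operations per second


def improvement_factors(before: StorageSample, after: StorageSample) -> dict:
    """Return an improvement factor per metric; above 1.0 means better after."""
    return {
        # Lower response time is better, so divide before by after.
        "read_ms": before.read_ms / after.read_ms,
        "write_ms": before.write_ms / after.write_ms,
        # Higher throughput and IOPS are better, so divide after by before.
        "throughput_mbps": after.throughput_mbps / before.throughput_mbps,
        "iops": after.iops / before.iops,
    }


def regressions(before: StorageSample, after: StorageSample) -> list:
    """Return the names of metrics that are worse after the change."""
    factors = improvement_factors(before, after)
    return [name for name, factor in factors.items() if factor < 1.0]


# Illustrative figures only.
baseline = StorageSample(read_ms=10.0, write_ms=20.0, throughput_mbps=100.0, iops=5000.0)
migrated = StorageSample(read_ms=5.0, write_ms=10.0, throughput_mbps=300.0, iops=4000.0)

print(improvement_factors(baseline, migrated))
print(regressions(baseline, migrated))
```

Keeping the baseline figures alongside the post-migration ones is what provides the evidence for third parties that Close mentions.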

Close also recommends ongoing monitoring of storage environments so that potential issues can be dealt with proactively.

“In any storage environment there will be errors and noise that can go undetected for months and years. These can blow up and become catastrophic and service impacting,” he said.

“You don’t want to be sat with your fingers crossed. If you are proactively monitoring your environment you can see errors clocking up and resolve things, taking an outage at a time that’s convenient for the organisation.”

Close spoke in particular about cyclic redundancy check (CRC) errors, which indicate that data has been corrupted in transit and whose cause can be a fairly simple physical issue.

“A single CRC error is one too many. In a SAN environment, CRC errors will cause re-transmission. By knowing errors, their trends and being proactive, you can stop them happening. Lasers degrade, SFPs fail, so it pays to keep an eye on them and replace or even just clean optical components.”
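
Watching errors “clocking up” can be sketched in a few lines of Python. The example below assumes per-port CRC error counters are cumulative and have already been collected into dictionaries by some other means. The port names, counts and function name are hypothetical, and how the counters are actually read depends on the switch or monitoring tool in use.

```python
def ports_with_new_crc_errors(previous: dict, current: dict) -> dict:
    """Return {port: new_errors} for ports whose CRC counter has grown.

    Both arguments map a port name to its cumulative CRC error count,
    taken at two successive polling times.
    """
    growth = {}
    for port, count in current.items():
        earlier = previous.get(port, 0)  # ports not seen before start at 0
        if count >= earlier:
            delta = count - earlier
        else:
            # Assumption: a lower count means the counter was reset,
            # so everything counted since the reset is treated as new.
            delta = count
        if delta > 0:
            growth[port] = delta
    return growth


# Illustrative readings only.
last_poll = {"port1": 0, "port2": 5, "port3": 9}
this_poll = {"port1": 0, "port2": 8, "port3": 2, "port4": 1}

print(ports_with_new_crc_errors(last_poll, this_poll))
```

In line with Close’s view that a single CRC error is one too many, the sketch reports any increase rather than applying a threshold. Each flagged port is a candidate for inspecting, cleaning or replacing its optical components.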
