Delivering efficient and optimised storage performance for virtual environments can be difficult to achieve because so many components in the infrastructure stack must interact.
But optimum storage performance can be achieved by starting with a set of metrics that measure storage operations in the virtual environment.
Latency – Also called response time, latency describes the time taken to complete a single I/O (input/output) operation and is essentially a measure of how fast a storage system responds to read and write requests.
Values are usually measured in milliseconds, with the fastest flash drives now quoting sub-millisecond figures. In an ideal world, latency would be zero, resulting in no penalty to the application for read/write operations to permanent storage media. But physics gets in the way and some latency exists for every I/O operation.
The aim for any storage solution is to minimise latency values, as storage is often the bottleneck in IT infrastructure. Lower latency means less waiting for the completion of I/O and therefore results in faster execution.
In virtual environments, latency has a direct impact on the speed at which virtual machines (VMs) and desktops operate, and reduced latency means more efficient use of processor and memory.
As a result, we have seen the adoption of solid-state storage into virtual environments and the movement of I/O management into the server. Here, flash-caching hardware and software aim to eliminate the need for data to traverse the network, thereby producing very low latency values.
Throughput – The capability of a storage system to transfer a fixed amount of data in a measured time is known as throughput, or bandwidth. Typically, throughput is measured in megabytes per second (MBps) or similar units.
Storage arrays and disk devices can be measured for two throughput metrics: sustained throughput and peak throughput. Sustained throughput is a measure of the constant capability of a device or system over a long period of time. Peak throughput indicates the levels a system can provide over short periods.
Peak throughput levels are important in VDI (virtual desktop infrastructure) environments, where boot storms (when many users log in and start up their virtual desktops at the same time) can generate huge I/O demand. If the system cannot manage the spike effectively, the result is poor performance and a rise in latency.
Good throughput numbers are also essential for virtual server environments when managing the dynamic movement of VMs between datastores. Being able to measure throughput and understand peak demand is critical to virtual environments.
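The difference between the two throughput metrics can be illustrated with a short Python sketch. It is only an illustration: the per-second samples are invented, and treating "peak" as the best average over a three-sample window is an assumption made here, not a standard definition.

```python
def sustained_and_peak(samples_mbps, window=3):
    """Return (sustained, peak) throughput from per-second MBps samples.

    sustained: mean over the whole run
    peak: highest mean over any `window` consecutive samples
    """
    if len(samples_mbps) < window:
        raise ValueError("need at least `window` samples")
    sustained = sum(samples_mbps) / len(samples_mbps)
    peak = max(
        sum(samples_mbps[i:i + window]) / window
        for i in range(len(samples_mbps) - window + 1)
    )
    return sustained, peak


# Hypothetical trace: steady load with a short boot-storm-like burst.
samples = [100, 100, 100, 400, 500, 450, 100, 100, 100, 50]
sustained, peak = sustained_and_peak(samples)
print(f"sustained: {sustained:.0f} MBps, peak: {peak:.0f} MBps")
```

In this made-up trace the burst is more than twice the long-run average, which is the kind of gap a system sized only for sustained load would struggle with.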
IOPS (input/output operations per second) – IOPS is a measure of the number of individual read/write requests a storage system can service per second. This figure is closely related to, but subtly different from, throughput.
In many cases, vendors will use IOPS as a measure of the performance of their products, but these figures need to be considered alongside the size of data chunks being transferred in each operation.
For example, many small (say, 4KB) requests are easier to handle than large (1MB) ones. Also, reads, especially of random rather than sequential datasets, are generally more time-consuming than writes. So, IOPS claims need to be judged against the size of the data chunks being dealt with and the type of operations they refer to.
The relationship between latency, throughput and IOPS
Latency, IOPS and throughput are all closely related. A storage system that can deliver I/O at very low latency will be able to deliver a high IOPS performance. At the most basic level, with only one I/O outstanding at a time, the number of IOPS achievable is simply 1/latency, so a latency of 3 milliseconds (1/0.003 seconds) translates to around 333 IOPS.
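The 1/latency arithmetic can be checked with a few lines of Python. This is a minimal sketch that assumes strictly serial I/O (one request in flight at a time); the function name and the sample latencies are illustrative only.

```python
def serial_iops(latency_ms):
    """IOPS achievable when each I/O must finish before the next starts."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms  # 1,000 ms in a second


# Illustrative latencies, from a slow disk seek down to fast flash.
for latency_ms in (10.0, 3.0, 0.5, 0.1):
    print(f"{latency_ms:>5} ms -> {serial_iops(latency_ms):>8.0f} IOPS")
```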
A system that can deliver a high number of IOPS with large data chunks will be able to deliver a high throughput, as the value is simply the number of IOPS multiplied by the I/O size.
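The IOPS-times-size relationship also shows why an IOPS figure means little without the I/O size attached. The sketch below uses invented numbers and assumes 1MB = 1,024KB; it is not taken from any vendor's specification.

```python
def throughput_mbps(iops, io_size_kb):
    """Throughput in MBps from IOPS and I/O size (assumes 1 MB = 1024 KB)."""
    return iops * io_size_kb / 1024.0


# Two hypothetical workloads: lots of small I/Os versus few large ones.
small = throughput_mbps(iops=100_000, io_size_kb=4)
large = throughput_mbps(iops=1_000, io_size_kb=1024)
print(f"100,000 IOPS at 4KB -> {small:.1f} MBps")
print(f"  1,000 IOPS at 1MB -> {large:.1f} MBps")
```

Here the workload with one hundredth of the IOPS moves more than twice the data per second.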
Of course, vendors quote far higher numbers than this when publishing performance figures, and there are some good reasons for this. Storage arrays are able to manage I/O workload with parallelism or concurrency, handling more than one I/O operation at the same time.
Concurrency is achieved by providing multiple paths to the storage system and using system memory as a cache to queue transactions. This leads us to a new measurement – queue depth – that describes how many I/O requests a device can handle simultaneously.
A single disk drive will have a queue depth in single or double figures, whereas a large enterprise array will provide a queue depth in the tens or hundreds per LUN, per port or a combination of both.
By queuing multiple requests together, a storage device can optimise the write process, reducing some of the physical latency of storing data, which is particularly efficient with spinning disk hard drives because head movement can be significantly reduced.
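The effect of concurrency on the headline figure can be approximated with Little's Law, which relates the number of requests in flight to throughput and response time. The sketch below is a rough model only: it assumes average latency stays constant as more I/Os are outstanding, which real devices do not guarantee, and the queue depths shown are illustrative.

```python
def concurrent_iops(latency_ms, outstanding_ios):
    """Little's Law estimate: IOPS = I/Os in flight / average latency.

    Assumes latency does not rise as concurrency increases.
    """
    if latency_ms <= 0 or outstanding_ios < 1:
        raise ValueError("latency must be positive and at least one I/O in flight")
    return outstanding_ios * 1000.0 / latency_ms


# Same 3 ms latency, increasing numbers of outstanding requests.
for depth in (1, 8, 32, 128):
    print(f"queue depth {depth:>3} -> {concurrent_iops(3.0, depth):>8.0f} IOPS")
```

With one request in flight the model reduces to the simple 1/latency figure; with 32 in flight the same 3 millisecond latency supports more than 10,000 IOPS.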
Workload profiles and where to measure
Identifying and recording metrics provides the raw data for understanding storage performance, but any numbers gained need to be considered in the context of the I/O profile and the point at which the measurements are taken. This is because all applications produce different workload demands.
As an example, VDI and virtual server traffic are highly randomised due to the dispersal of active data throughout a datastore or volume storing virtual hard disks. VDI data is typically read-heavy (80R/20W as a percentage split, or higher), so low read I/O latency gives a significant performance boost.
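To see why read latency dominates in a read-heavy profile, consider the average latency across a mixed workload. The sketch below uses a simple weighted mean; the 5 millisecond read and 2 millisecond write figures are hypothetical, chosen only to show the effect of the 80/20 split.

```python
def blended_latency_ms(read_fraction, read_ms, write_ms):
    """Average latency per I/O for a given read/write mix (weighted mean)."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    return read_fraction * read_ms + (1.0 - read_fraction) * write_ms


# Hypothetical 80R/20W profile with 5 ms reads and 2 ms writes.
baseline = blended_latency_ms(0.8, read_ms=5.0, write_ms=2.0)
faster_reads = blended_latency_ms(0.8, read_ms=2.5, write_ms=2.0)
faster_writes = blended_latency_ms(0.8, read_ms=5.0, write_ms=1.0)
print(f"baseline:         {baseline:.1f} ms")
print(f"reads halved:     {faster_reads:.1f} ms")
print(f"writes halved:    {faster_writes:.1f} ms")
```

Under these assumptions, halving read latency cuts the average far more than halving write latency does.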
Choosing where metrics are recorded is also important to provide an end-to-end view of I/O performance.
In the days of the mainframe, a single I/O operation could be tracked from start to finish, showing where delays occurred at each step of the journey. Today, I/O is much more complex, with measurements possible at the storage device, within the hypervisor and within the host itself.
There’s no right or wrong place to take measurements; each gives a perspective on the operation of the system. Values taken from the array show how well the external storage copes with demand. Values taken from the host, for example, show how contention at the datastore affects individual guest performance, while values taken from the hypervisor show the effectiveness of the storage network.
Both of the common hypervisors (vSphere ESXi and Hyper-V) allow for the movement of workloads to optimise storage performance. Storage DRS for vSphere, for example, bases VM migration recommendations on historical I/O latency measurements of the underlying datastores. Meanwhile, Intelligent Placement in Hyper-V makes calculations based on VM IOPS.
Let’s not forget
Most of the numbers we have talked about are technology-based, but there are financial metrics to consider when purchasing products and placing data on tiers of storage. These include $/GB, which measures cost per unit of capacity. But with the introduction of flash, where $/GB is so much higher than with spinning disk, $/IOPS can be a more useful measure when application performance is the greater consideration.
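The two financial metrics can pull in opposite directions, as a quick calculation shows. The prices, capacities and IOPS figures below are entirely hypothetical and are not quotes for any real product.

```python
def cost_metrics(price_usd, capacity_gb, iops):
    """Return ($/GB, $/IOPS) for a device or system."""
    return price_usd / capacity_gb, price_usd / iops


# Hypothetical devices, for illustration only.
devices = {
    "spinning disk": dict(price_usd=300, capacity_gb=4000, iops=150),
    "flash drive": dict(price_usd=500, capacity_gb=1000, iops=50_000),
}

for name, spec in devices.items():
    per_gb, per_iops = cost_metrics(**spec)
    print(f"{name:>13}: ${per_gb:.3f}/GB, ${per_iops:.2f}/IOPS")
```

In this example the flash drive costs several times more per gigabyte but a small fraction as much per IOPS, which is why the choice of metric depends on whether capacity or performance is the constraint.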
Putting it all together
Due to the random nature of virtual environments, latency is a key metric in monitoring the status of physical storage resources. Latency is relevant whether a system contains one or one hundred VMs.
When we look at the capacity to support many VMs, throughput plays a big role, because the ability to scale a virtual environment requires corresponding capability in throughput. As already discussed, managing peak demand can be an issue for VDI environments, which have peak read and write load periods.
From the perspective of the host, IOPS is typically used as the standard measure, because this provides an abstracted view that isn’t dependent on the underlying hardware capabilities. IOPS is used as a measure in both private and cloud virtual infrastructures.