The concept of thin provisioning in storage has been around for longer than many people think. I first used space-efficient volumes in the early 1990s on StorageTek’s Iceberg storage array, later resold by IBM as the RAMAC Virtual Array.
The idea of thin provisioning is a simple one. On a traditional array, creating a logical unit number, or LUN, reserves the entire capacity of that volume, whether or not the host fully utilises it. In many cases host utilisation can be as low as 30%, which represents a significant waste of space.
Thin provisioning makes volume creation more efficient by reserving physical capacity on the array only when the host actually writes data to the logical volume. The result is that (subject to a little management) significant cost savings can be made using thin provisioned LUNs.
The question, therefore, is which approach is more efficient: thin provisioning on the array, in the hypervisor, or both?
Array-based thin provisioning
The most obvious benefit of using thin provisioning in the array is the saving on physical capacity. Imagine 100 LUNs, each 2TB in size and 75% utilised by host servers. A traditional array would require 200TB of physical storage and waste 50TB of that allocation. A thin-provisioned array would consume only the 150TB actually written, and could allocate a further 33 new 2TB volumes (at 75% utilisation) before running out of physical space.
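The capacity arithmetic above can be checked in a few lines of Python. The figures are the article's; the variable names are mine:

```python
# Worked version of the example above: 100 LUNs of 2TB each, 75% utilised.
LUN_COUNT = 100
LUN_SIZE_TB = 2
UTILISATION = 0.75

# Thick provisioning reserves the full logical size of every LUN.
thick_physical_tb = LUN_COUNT * LUN_SIZE_TB          # 200TB reserved
wasted_tb = thick_physical_tb * (1 - UTILISATION)    # 50TB never written

# Thin provisioning only consumes what the hosts actually write.
thin_physical_tb = thick_physical_tb * UTILISATION   # 150TB consumed

# The 50TB saved can back further thin LUNs at the same utilisation rate.
extra_luns = int(wasted_tb / (LUN_SIZE_TB * UTILISATION))  # 33 more 2TB LUNs

print(thick_physical_tb, wasted_tb, thin_physical_tb, extra_luns)
# → 200 50.0 150.0 33
```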
As well as saving on capacity, the ability to effectively create many volumes on an array with minimal overhead provides significant administrative benefits.
Until we see VMware’s VVOLs arrive in production, failover of virtual machine guests from one array to another occurs at the volume or LUN level. Thin provisioning allows volumes to be created for administrative purposes, grouping related hosts together on the same volume for failover without unduly wasting disk space. The same principle applies to snapshots when taken at the array level.
Following a similar principle, when there is no restriction on the logical size of a volume, datastores can be created at their maximum likely size when first allocated, removing the need to resize in the future (and incur the impact of maintenance windows in the process).
Storage arrays with thin provisioning are now starting to gain additional space-efficiency functions, such as data deduplication. Implemented together with thin provisioning, these features can result in significant space savings, especially in VDI deployments.
Finally, one significant benefit can be performance. Thin LUNs are usually created across many RAID sets or disk groups within an array, so many disk drives service a single LUN and deliver higher performance than a thick LUN confined to one disk group. This benefit is, however, becoming less important as we move to all-flash and hybrid flash arrays and as dispersed data layouts become the norm.
Hypervisor-based thin provisioning
Both Microsoft Hyper-V and VMware vSphere provide the ability to thin-provision virtual machine host storage.
Hyper-V stores virtual machine data in VHDs or virtual hard disks, which are individual files that represent a disk volume. VHDs (or the newer VHDXs) can be allocated as fixed or dynamic, representing fully allocated and thin-provisioned formats respectively. Dynamic VHDs are allocated with minimal space overhead (configuration and mapping details are stored in the header and footer of the file) and expanded as the host writes to the disk.
As would be expected, dynamic VHDs are more efficient than fixed VHDs. However, some housekeeping is necessary to keep dynamic VHDs efficient as deleted data is not reclaimed from the VHD unless a compact operation is performed – which can only be done with the VM shut down.
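The grow-but-never-shrink behaviour described above can be illustrated with a toy model. This is a sketch of the lifecycle only, not Hyper-V's actual VHD on-disk format; all names are mine:

```python
# Toy model of a dynamic VHD: the file grows on first write, guest deletes
# do not shrink it, and only an offline compact operation reclaims space.
class DynamicVhd:
    def __init__(self):
        self.allocated = set()   # blocks backed by the VHD file on disk
        self.live = set()        # blocks the guest file system still uses

    def write(self, block):
        self.allocated.add(block)   # file expands on first write to a block
        self.live.add(block)

    def guest_delete(self, block):
        self.live.discard(block)    # guest frees it; the file keeps the block

    def compact(self):
        # Housekeeping (VM shut down): drop blocks the guest no longer uses.
        self.allocated &= self.live

    def file_size(self):
        return len(self.allocated)

vhd = DynamicVhd()
for b in range(10):
    vhd.write(b)
for b in range(5):
    vhd.guest_delete(b)
print(vhd.file_size())  # → 10: deletes alone reclaim nothing
vhd.compact()
print(vhd.file_size())  # → 5 after compaction
```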
Dynamic VHDs also have a performance overhead compared to fixed VHDs (as a result of mapping logical to physical location for each block of data) that can result in a 10% to 15% increase in I/O latency, making them less suitable for high-performance applications.
VMware vSphere implements thin provisioning within a datastore by using different types of virtual machine disk (VMDK) formats. At VM creation time, VMDKs can be allocated as one of three types:
- thin (space is allocated and zeroed out on first write)
- zeroedthick (space is reserved at creation time; each block is zeroed on first write)
- eagerzeroedthick (space is reserved and zeroed out at creation time)
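The trade-off between the three formats can be summarised in code. This is my own sketch of the behaviour described above, not VMware code, and all names are illustrative:

```python
# For each VMDK format: is space reserved at creation, and zeroed at creation?
VMDK_FORMATS = {
    "thin":             (False, False),
    "zeroedthick":      (True,  False),
    "eagerzeroedthick": (True,  True),
}

def datastore_blocks_used_at_create(fmt, size_blocks):
    """Datastore space consumed the moment the VMDK is created."""
    reserved, _ = VMDK_FORMATS[fmt]
    return size_blocks if reserved else 0

def work_on_first_write(fmt):
    """Extra steps incurred when the guest first writes to a block."""
    reserved, zeroed = VMDK_FORMATS[fmt]
    steps = []
    if not reserved:
        steps.append("allocate")
    if not zeroed:
        steps.append("zero")
    return steps

# thin is cheapest up front but does the most work per first write;
# eagerzeroedthick pays everything at creation time.
print(datastore_blocks_used_at_create("thin", 1000))      # → 0
print(work_on_first_write("thin"))                        # → ['allocate', 'zero']
print(work_on_first_write("eagerzeroedthick"))            # → []
```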
The thin format offers the most efficient use of space but incurs a performance overhead as each new block of data is written. The other two formats reserve all space within the datastore, optionally overwriting all the reserved space with binary zeroes in the case of eagerzeroedthick.
Thin VMDKs ensure space is allocated on demand to each VM as it is used. However, space isn’t reclaimed automatically when virtual machines are deleted or migrated to other datastores.
VMware disabled the reclaim function, which was automated on first release, as it caused performance problems with some array vendors' hardware. On vSphere ESXi 5.5, released physical space can be reclaimed using the 'esxcli storage vmfs unmap' command, if supported by the array. The alternative is to storage vMotion the guest to another datastore, which has the effect of reorganising the thin-provisioned VMDKs back to an efficient state.
The right method
Thin provisioning has to be considered in two modes of operation: the initial allocation, which can be very efficient, and the ongoing 'maintenance' of a volume, the so-called 'stay thin' process.
As data is created and destroyed on volumes by the host, file system fragmentation means physical blocks remain assigned to a volume even though the host or guest VM no longer uses them.
I would always choose to use thin provisioning at both the array and host level.
Array-based thin provisioning that implements 'zero page detect' (releasing blocks that contain only binary zeroes) allows savings to be made with little maintenance overhead, providing 'stay thin' benefits.
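Zero page detect is conceptually simple: scan the volume in fixed-size pages and release any page that is entirely zero-filled back to the free pool. A minimal sketch, assuming a 4KB page size (the page size and function names are illustrative):

```python
PAGE_SIZE = 4096  # illustrative page size; real arrays vary

def reclaimable_pages(volume: bytes, page_size: int = PAGE_SIZE):
    """Yield the offsets of pages that contain only binary zeroes."""
    zero_page = bytes(page_size)
    for offset in range(0, len(volume), page_size):
        if volume[offset:offset + page_size] == zero_page:
            yield offset

# Three pages: zeroed, in use, zeroed. The first and third can be released.
data = bytes(PAGE_SIZE) + b"\x01" * PAGE_SIZE + bytes(PAGE_SIZE)
print(list(reclaimable_pages(data)))  # → [0, 8192]
```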
Making savings from hypervisor-based thin provisioning requires some maintenance and planning, but real savings can still be delivered, and some benefits are realised automatically when VMs are, for example, vMotioned between datastores.
Whichever method you decide to use (and you can use both), ensure you have decent tools and reporting, so you understand exactly what savings can be made by running cleanup processes, especially those that require VMs to be shut down. It's always important to know what gain you will get from the pain.