Thin provisioning: Over-allocation, wide striping, space reclamation

Learn how the three main advantages of thin provisioning -- over-allocation, wide striping and space reclamation -- can help you make the most of your storage.

What are the advantages of thin provisioning? In what kinds of scenarios does thin provisioning make sense?

Thin provisioning has been available in the NAS space for many years, but is a relatively new concept for SAN storage. The three main perceived advantages of thin provisioning are over-allocation, wide striping and space reclamation.

Thin provisioning advantage #1: Over-allocation of storage

When designing a SAN and purchasing storage, it is sensible to have a good idea of how much storage is required. This may seem obvious, but it is often difficult to achieve. Many organisations are not particularly good at capacity planning, and app owners are often uncertain how much storage capacity a new application will initially require or how it will grow.

Standard storage allocation involves allocating a chunk of the available storage to a particular host. This storage is for exclusive use of the host no matter how much of it is consumed. Once storage has been allocated, it is very rare for it to be removed; so if a host is allocated 100 GB of capacity, it will almost certainly have at least 100 GB allocated to it until it is decommissioned. If the original request was an overestimate of the host's needs (which is often the case), and it only required 40 GB of storage, the remaining 60 GB of storage is wasted.

Thin provisioning allows storage administrators to nominally allocate, from a shared storage pool, as much capacity as an app owner thinks they will require, without giving the host exclusive access to that capacity. Capacity is only consumed when blocks are actually written to, while free space remains available to any host that has access to the storage pool. In this way hosts can have 1 TB "allocated" to them while the storage pool only contains 500 GB of capacity, of which 300 GB is actually consumed.
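The arithmetic in that example can be sketched as a toy model. This is an illustration only, not any vendor's implementation: the `ThinPool` class, the host names and the even split of the 1 TB between two hosts are all assumptions, and 1 TB is treated as 1,000 GB.

```python
from dataclasses import dataclass, field


@dataclass
class ThinPool:
    """Toy model of a thin-provisioned pool (hypothetical); sizes in GB."""
    physical_gb: int
    allocated: dict = field(default_factory=dict)  # host -> nominal size
    written: dict = field(default_factory=dict)    # host -> GB actually written

    def allocate(self, host: str, size_gb: int) -> None:
        # Allocation is nominal: nothing is reserved in the pool.
        self.allocated[host] = self.allocated.get(host, 0) + size_gb
        self.written.setdefault(host, 0)

    def write(self, host: str, size_gb: int) -> None:
        # Capacity is consumed only when blocks are written.
        if self.written[host] + size_gb > self.allocated[host]:
            raise ValueError("write exceeds the host's nominal allocation")
        if self.consumed_gb + size_gb > self.physical_gb:
            raise RuntimeError("pool is full")
        self.written[host] += size_gb

    @property
    def consumed_gb(self) -> int:
        return sum(self.written.values())

    @property
    def free_gb(self) -> int:
        return self.physical_gb - self.consumed_gb

    @property
    def oversubscription(self) -> float:
        # Ratio of nominally allocated capacity to physical capacity.
        return sum(self.allocated.values()) / self.physical_gb


# The figures from the text: 1 TB allocated, 500 GB pool, 300 GB consumed.
pool = ThinPool(physical_gb=500)
pool.allocate("host-a", 500)   # hypothetical hosts
pool.allocate("host-b", 500)
pool.write("host-a", 200)
pool.write("host-b", 100)
print(pool.consumed_gb, pool.free_gb, pool.oversubscription)  # 300 200 2.0
```

The remaining 200 GB of free space is available to whichever host writes next, which is what distinguishes this from standard allocation.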

There is one caveat to this approach. Many databases use raw volumes, and during the build process initialise the entire raw volume for database usage. The thin-provisioned volume would then report the entire raw volume as used and not allow its use by other hosts, thus negating many of the advantages of over-allocation.

Thin provisioning advantage #2: Automated wide striping

One of the best ways to increase the performance of a particular logical unit number (LUN) is to stripe the LUN over as many drive spindles as possible. Thin provisioning architectures usually involve placing a number of traditional RAID groups into a storage pool that is then usable for thin-provisioned volumes. When LUNs are allocated to hosts and data is written, it is usually striped over multiple physical RAID groups in the storage pool.

Striping data in this way across multiple RAID groups gives the performance of more spindles to an individual LUN when required. It should be noted, however, that the overall performance available from the drives in the pool will be similar to what would be available if the RAID groups were separated. The difference is that the entire performance is available to every LUN in the pool rather than being available in "performance islands."
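As a rough illustration of both points, the sketch below places a LUN's extents round-robin across the RAID groups in a pool and compares peak per-LUN performance with and without pooling. Round-robin placement, the group names and the IOPS figure are assumptions made for the example; real arrays use their own placement policies.

```python
from collections import Counter


def stripe_layout(num_extents: int, raid_groups: list[str]) -> list[str]:
    """Assign each extent of a LUN to a RAID group, round-robin."""
    return [raid_groups[i % len(raid_groups)] for i in range(num_extents)]


groups = ["RG0", "RG1", "RG2", "RG3"]   # hypothetical pool of four RAID groups
layout = stripe_layout(8, groups)
extents_per_group = Counter(layout)     # every group holds 2 of the 8 extents

# Hypothetical figure: assume each RAID group can deliver 1,000 IOPS.
IOPS_PER_GROUP = 1000
pool_iops = IOPS_PER_GROUP * len(groups)   # total is the same either way

peak_lun_iops_isolated = IOPS_PER_GROUP    # LUN confined to one RAID group
peak_lun_iops_pooled = pool_iops           # LUN striped over the whole pool

print(layout)
print(peak_lun_iops_isolated, peak_lun_iops_pooled)  # 1000 4000
```

The total stays at 4,000 IOPS in both cases; pooling only changes how much of it a single busy LUN can draw on.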

Thin provisioning advantage #3: Space reclamation

Many LUNs that are allocated to hosts are poorly utilised and waste a great deal of capacity. For instance, a 100 GB volume that only has 40 GB of data on it wastes 60 GB of storage that could be used by other hosts. Many vendors provide the ability to reclaim this wasted space when a file system on a traditional LUN is migrated to a thin-provisioned storage pool. This empty space would then be merged into the pool and be available for writing to by other hosts.
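How the space is reclaimed varies by vendor. The sketch below assumes one approach, in which the migration skips blocks that contain only zeros so they are never mapped in the thin pool. The block size, the function name and the sample volume are hypothetical.

```python
BLOCK_SIZE = 4096  # hypothetical block size in bytes


def migrate_to_thin(blocks: list[bytes]) -> dict[int, bytes]:
    """Copy only blocks that hold data; all-zero blocks stay unmapped."""
    zero_block = bytes(BLOCK_SIZE)
    return {i: b for i, b in enumerate(blocks) if b != zero_block}


# A 10-block source volume of which only the first 4 blocks hold data,
# mirroring the 100 GB volume with 40 GB of data in the text.
source = [b"\x01" * BLOCK_SIZE if i < 4 else bytes(BLOCK_SIZE) for i in range(10)]
thin = migrate_to_thin(source)

reclaimed_blocks = len(source) - len(thin)
print(len(thin), reclaimed_blocks)  # 4 6
```

Only the four populated blocks consume pool capacity after migration; the other six remain free for other hosts.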

One of the main pitfalls of thin provisioning is filling up the storage pool. Because thin provisioning "oversubscribes" storage (i.e., it allocates more capacity to hosts than is actually there), it is possible for the storage pool to fill up as hosts consume the notional storage they have been allocated. If this happens, writes will be disabled for all hosts using the pool.

If many hosts are attached to the pool, this could have potentially catastrophic consequences for a business that relies on these hosts. It is therefore imperative to have robust monitoring of utilisation levels for thin-provisioned volumes. Alerts must allow adequate time for additional storage to be purchased and deployed before the remaining capacity of the pool is consumed.
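One way to frame such an alert is to compare the estimated time until the pool fills with the lead time needed to buy and deploy more storage. The sketch below assumes a constant daily growth rate, which real workloads rarely have; the function names and all figures are hypothetical.

```python
def days_until_full(pool_gb: float, used_gb: float, growth_gb_per_day: float) -> float:
    """Estimate days of headroom left, assuming linear growth."""
    if growth_gb_per_day <= 0:
        return float("inf")  # pool is not growing
    return max(pool_gb - used_gb, 0) / growth_gb_per_day


def should_alert(pool_gb: float, used_gb: float,
                 growth_gb_per_day: float, lead_time_days: float) -> bool:
    """Alert when headroom is no longer than the procurement lead time."""
    return days_until_full(pool_gb, used_gb, growth_gb_per_day) <= lead_time_days


# 500 GB pool with 300 GB consumed, growing by an assumed 5 GB per day.
headroom = days_until_full(500, 300, 5)
print(headroom)                         # 40.0
print(should_alert(500, 300, 5, 45))    # True: 45-day lead time is too long
print(should_alert(500, 300, 5, 30))    # False: still 10 days of slack
```

Basing the threshold on lead time rather than a fixed percentage ties the alert directly to how long it takes to add capacity.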

Thin provisioning is an excellent way to maximise the utilisation of capacity in shared storage environments and can help drive down waste. It is particularly suited to environments in which storage requirements are transient or temporary, where capacity is drawn from the pool for a short period of time and then returned once it is no longer needed.

Thin provisioning pools are also useful when data needs to be striped across many more spindles for performance reasons. If you need the performance but are concerned about running out of capacity, then don't over-provision the storage.
