Virtual machine storage and block vs file

Virtual machine storage demands greater performance, capacity and resiliency than storage for physical servers, and it calls for a careful choice between block and file access.

Virtual machine storage is a demanding business. Where once one app server equalled one physical server and the demands on storage were known and predictable, with virtual machines, storage I/O has become massively random owing to the many instances of virtual servers that can live in one box. For that reason you now need to plan more carefully for the provision of storage to virtual servers.

In this interview, Bureau Chief Antony Adshead speaks with Chris Evans, an independent consultant with Langton Blue, about the requirements of virtual machine storage and whether block- or file-based access is best suited to a virtual server environment.

Read the transcript or listen to the podcast on virtual machine storage.

What are the storage requirements of virtual servers?

Evans: We can divide this question into three categories: capacity, performance and resiliency.

Let’s start with capacity. Clearly, if we’re deploying many virtual machines in an environment, we’re going to want to have a large amount of storage in which to put them. So, first of all we need storage arrays to give us large scale and that can run into tens or hundreds of terabytes. We also need that storage to support large LUNs. As we know, VMware only allows a relatively small number of LUNs per cluster -- around 256 -- so clearly we need to have support for large LUNs. And now within vSphere 5 we’re talking about LUNs that can go larger than 2 TB. The current limit for a single VMFS LUN is 64 TB.
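To make the capacity arithmetic concrete, here is a minimal sketch that uses only the figures quoted above (a limit of around 256 LUNs, a 2 TB LUN ceiling before vSphere 5 and a 64 TB VMFS LUN limit). The estate of 1,200 virtual machines averaging 0.5 TB each is a hypothetical example, and the function names are illustrative rather than part of any VMware tool.

```python
import math

# Figures quoted in the interview; treat them as illustrative, not as current limits.
MAX_LUNS = 256         # approximate LUN limit cited by Evans
OLD_LUN_LIMIT_TB = 2   # LUN size ceiling before vSphere 5, as mentioned above
NEW_LUN_LIMIT_TB = 64  # single VMFS LUN limit, as mentioned above


def luns_needed(total_tb: float, lun_size_tb: float) -> int:
    """Return how many LUNs of lun_size_tb are needed to hold total_tb."""
    return math.ceil(total_tb / lun_size_tb)


def fits(total_tb: float, lun_size_tb: float, max_luns: int = MAX_LUNS) -> bool:
    """Return True if the required LUN count stays within the LUN limit."""
    return luns_needed(total_tb, lun_size_tb) <= max_luns


# Hypothetical estate: 1,200 VMs averaging 0.5 TB each = 600 TB.
estate_tb = 1200 * 0.5

for lun_size in (OLD_LUN_LIMIT_TB, NEW_LUN_LIMIT_TB):
    count = luns_needed(estate_tb, lun_size)
    print(f"{lun_size} TB LUNs: {count} needed, fits={fits(estate_tb, lun_size)}")
```

Under these assumptions the 2 TB ceiling would require more LUNs than the limit allows, while 64 TB LUNs leave plenty of headroom, which is why support for large LUNs matters.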

The second thing we want is performance. Virtual environments generate much more random I/O workload, and therefore we need storage that can support that random I/O performance. Tied to that, we want the ability to achieve high volumes of I/O, so if we’re cloning or copying virtual machines or moving [them] around within an environment, we need to be able to guarantee that that array can support high performance.
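The shift to random I/O can be illustrated with a deliberately simplified model. It assumes each virtual machine reads sequentially within its own region of the array and that the hypervisor interleaves requests in strict round-robin order; real schedulers are less regular, and the function names and region size here are hypothetical.

```python
def interleaved_stream(num_vms: int, reads_per_vm: int, region_size: int = 1_000_000):
    """Round-robin the sequential reads of several VMs into one request stream.

    Each VM reads block 0, 1, 2, ... of its own region; the array sees the
    merged stream of absolute block addresses.
    """
    stream = []
    for i in range(reads_per_vm):
        for vm in range(num_vms):
            stream.append(vm * region_size + i)
    return stream


def sequential_fraction(stream):
    """Fraction of requests that directly follow the previous block address."""
    if len(stream) < 2:
        return 1.0
    hits = sum(1 for prev, cur in zip(stream, stream[1:]) if cur == prev + 1)
    return hits / (len(stream) - 1)


for vms in (1, 2, 8):
    fraction = sequential_fraction(interleaved_stream(vms, 100))
    print(f"{vms} VM(s): {fraction:.0%} of requests are sequential at the array")
```

In this model a single server produces a fully sequential stream, but as soon as two or more virtual machines share the array no request follows its predecessor, even though every individual workload is still sequential.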

Thirdly, we need resiliency. What we intend to do with a virtual environment is concentrate large numbers of virtual machines together, and they may have sat on a large number of physical servers before that, so we’re putting a higher degree of concentration into one small area, and we need these arrays to be highly available. We need good uptime, and we need to make sure we can run them 24 hours a day.

What’s best for virtual machine storage: block or file access?

Evans: This question goes on and on. Lots of people say they prefer NFS; others say they prefer block. It’s worth looking at what the two standards give us and why we choose one over the other.

With a block-based array we can use [one of the] traditional protocols, such as iSCSI, Fibre Channel or FCoE. We know they’re highly reliable [and give us] features such as multipathing, reliable delivery and so on. Block environments are very good and very stable.

We can see that in some of the advanced features that have come into the environment recently. Through VAAI (vStorage APIs for Array Integration), we can offload some of the heavy lifting, moving some of the cloning and copying to the back-end array.

If we look at the way NFS works, [it] doesn’t actually have a file system that gets formatted on it in the same way that VMFS does with block storage. Rather, we use the NFS file system itself. As a consequence, we can directly access the files that make up a virtual machine. This allows us to copy them, to edit them, or even perhaps to clone or replicate them. That’s great for creating clones of virtual machines, for instance, and in a thin-provisioned environment we can create large numbers of clones quite quickly and cheaply in terms of storage.

A lot of people use NFS for disk images because they see that as a great way to drop files onto an environment where they can then map those to virtual machines.

NFS did have scalability issues until recently. Version 3 of ESX only allowed 32 NFS data stores to be mapped to a server. That’s now been expanded to 256 so that scalability issue has been taken away.

So, the question of whether we should use block or NFS is really a question of what you want to use it for and what your requirements are. Both are equally suited, and we see all the time that one leapfrogs the other in terms of features. There’s probably not one clear leader -- it depends on exactly how you’d like to use it.
