Block vs file for Hyper-V and VMware storage: Which is better?

Is block- or file-based storage better for Hyper-V and VMware virtual server environments? The answer depends on the precise needs of your virtual server environment.

When it comes to Hyper-V and VMware storage, which is better: block- or file-based access? The rate of adoption of server virtualisation has accelerated over recent years, and virtual server workloads now encompass many production applications, including Tier 1 applications such as databases.

For that reason, it is now more important than ever that Hyper-V and VMware storage is well matched to the requirements of the environment. In this article, we will discuss the basic requirements for Hyper-V and VMware storage and examine the key question of block vs file storage in such deployments.

Basic requirements for storage in virtual server environments

When selecting storage for virtual server environments, a basic set of requirements must be met, irrespective of the hypervisor or the storage protocol. These include:

  • Shared access. Storage connected to hypervisors typically needs to be shared among hypervisor hosts, which enables redundant, highly available configurations. Where shared storage is implemented across multiple hypervisors, guests can be load-balanced across the servers for performance and remain available in the event of a server failure.
  • Scalability. Virtual server environments can include hundreds of virtual machines, so any storage solution needs to scale to cater for the large volume of data virtual guests create. Connectivity must also scale, providing for multiple hosts, each with multiple redundant connections.
  • High availability. Virtual server environments can contain hundreds of virtual servers or desktops. This represents a concentration of risk that demands high availability from the storage array. Availability should be measured not only in terms of array uptime but also in terms of the components that connect the server to the array, such as network or Fibre Channel switching.
  • Performance. Virtual environments create a different I/O performance profile from that of individual physical servers. Typically, I/O is random in nature, but certain tasks, such as backup and guest cloning, can generate heavy sequential I/O.

Protocol choice: Block vs file?

Virtual servers can be deployed either to direct-attached storage (DAS) or to networked storage (NAS or SAN). DAS does not provide the shared access required of highly available virtual clusters because it is physically associated with a single physical server. Enterprise-class solutions therefore use networked storage, and that means protocols such as NFS, CIFS, iSCSI, Fibre Channel and Fibre Channel over Ethernet (FCoE).

File-level access: NAS

Network-attached storage encompasses the NFS and CIFS protocols and refers specifically to the use of file-based storage to store virtual guests. VMware ESXi supports only NFS for file-level access; Hyper-V supports only CIFS. This difference is perhaps explained by lineage: CIFS was developed by Microsoft from Server Message Block (SMB), whereas NFS was originally developed by Sun Microsystems for its Unix operating systems, and ESXi, like Solaris, has Unix roots.

For VMware, NFS is a good choice of protocol as it provides a number of distinct benefits.

  • Virtual machines are stored in directories on NFS shares, making them easy to access without going through the hypervisor. This is useful for taking virtual machine backups or cloning an individual virtual guest (see the sketch after this list). VMware configuration files can also be created or edited directly.
  • Virtual storage can easily be shared among multiple virtual servers; VMware uses lock files on the share to ensure integrity in a clustered environment.
  • No extra server hardware is required to access NFS shares; standard network interface cards (NICs) are sufficient.
  • Virtual guests can be thinly provisioned, if the underlying storage hardware supports it.
  • Network shares can be expanded dynamically, if the storage filer supports it, without any impact on ESXi.
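
As a rough illustration of that out-of-band access, the following Python sketch copies one guest's directory from an NFS datastore that has been mounted on a separate backup host. The mount point, backup destination and guest name are all hypothetical, and the guest should be powered off or snapshotted first so that the copied files are consistent; treat this as a sketch of the idea rather than a production backup tool.

    import shutil
    from datetime import datetime
    from pathlib import Path

    # Assumed locations: both paths are hypothetical.
    DATASTORE = Path("/mnt/datastore1")   # NFS share mounted on the backup host
    BACKUP_ROOT = Path("/backups/vms")    # where copies are kept

    def backup_guest(guest_name: str) -> Path:
        """Copy one guest's directory (.vmx, .vmdk, logs) to a dated folder."""
        source = DATASTORE / guest_name
        if not source.is_dir():
            raise FileNotFoundError(f"No directory for guest {guest_name!r} on the datastore")
        target = BACKUP_ROOT / f"{guest_name}-{datetime.now():%Y%m%d-%H%M%S}"
        shutil.copytree(source, target)   # a plain file copy; NFS exposes ordinary files
        return target

    if __name__ == "__main__":
        print(backup_guest("web01"))      # 'web01' is a placeholder guest name

The point is simply that the guest is visible as ordinary files; no hypervisor APIs or agents are needed to read it.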

There are, however, some disadvantages when using NFS with VMware.

  • Scalability is limited to eight NFS shares per VMware host by default (this can be raised to 64, but doing so also requires the TCP/IP heap size to be increased).
  • Although NFS shares can scale to the maximum size permitted by the storage filer, a share is typically created from one group of disks with a single performance characteristic; therefore, all guests on the share experience the same I/O performance profile.
  • NFS does not support multipathing, so high availability has to be managed at the physical network layer, with bonded networks on ESXi and virtual interfaces on the storage array, if the array supports them.

For Hyper-V, CIFS allows virtual machines (stored as virtual hard disk, or VHD, files) to be stored and accessed on CIFS shares, specified either by a Uniform Naming Convention (UNC) path or by a share mapped to a drive letter. While this provides a degree of flexibility in storing virtual machines on Windows file servers, CIFS is an inefficient protocol for the block-style access Hyper-V requires and is not a good choice. It is disappointing that Microsoft currently doesn't support Hyper-V guests on NFS shares; this seems like a glaring omission.

Block-level access: Fibre Channel and iSCSI

Block protocols include iSCSI, Fibre Channel and FCoE. Fibre Channel and FCoE are delivered over dedicated host adapter cards (HBAs and CNAs, respectively). iSCSI can be delivered over standard NICs or via dedicated TOE (TCP/IP Offload Engine) HBAs. For both VMware and Hyper-V, the use of Fibre Channel or FCoE means additional cost for dedicated storage networking hardware. iSCSI does not strictly require additional hardware, but customers may find dedicated adapters necessary to achieve better performance.

VMware supports all three block storage protocols. In each case, storage is presented to the VMware host as a LUN. Block storage has the following advantages.

  • Each LUN is formatted with the Virtual Machine File System (VMFS), which is designed specifically for storing virtual machines.
  • VMware supports multipath I/O for iSCSI and Fibre Channel/FCoE.
  • Block protocols support hardware acceleration through vStorage APIs for Array Integration (VAAI). These hardware-based instructions improve the performance of data migration and locking to increase throughput and scalability.
  • ESXi 4.x supports “boot from SAN” for all protocols, enabling stateless deployments.
  • SAN environments can use Raw Device Mapping (RDM), which enables virtual guests to issue non-standard SCSI commands directly to LUNs on the storage array. This feature is useful for guests running storage management software.

For VMware, there are some disadvantages to using block storage.

  • VMFS is proprietary to VMware, and data on a VMFS LUN can be accessed only through the hypervisor, which makes out-of-band access to guest files cumbersome and slow.
  • Replication of SAN storage usually occurs at the LUN level; replicating a single virtual machine is therefore more complex, and wasteful of resources, where multiple guests exist on the same VMFS LUN.
  • iSCSI traffic is not encrypted by the protocol itself, so it passes across the network in the clear unless it is secured separately, for example with IPsec.
  • iSCSI security is limited to CHAP (Challenge-Handshake Authentication Protocol), which isn't centralised and has to be managed through the storage array and/or each VMware host; in large deployments this can be a significant management overhead (the sketch below shows the exchange involved).
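
To make the CHAP point concrete, here is a simplified Python sketch of the challenge-response exchange iSCSI uses when MD5 CHAP is negotiated (per RFC 1994): the initiator returns an MD5 hash of a one-byte identifier, the shared secret and the target's challenge. The identifier, secret and challenge values below are made up for illustration. Because the same secret has to be configured on the array and on every host, there is no central point of management unless an external service such as RADIUS is brought in.

    import hashlib
    import os

    def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
        """Response = MD5(identifier || secret || challenge), per RFC 1994."""
        return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

    # Target side: issue a random challenge with a one-byte identifier.
    identifier = 0x01
    challenge = os.urandom(16)

    # Initiator side: prove knowledge of the shared secret without sending it.
    secret = b"example-chap-secret"   # must be configured on the array AND every host
    response = chap_response(identifier, secret, challenge)

    # Target side: recompute and compare to authenticate the initiator.
    assert response == chap_response(identifier, secret, challenge)
    print("CHAP response:", response.hex())

Note that CHAP only authenticates the parties; it does nothing to protect the data that follows on the wire.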

Hyper-V is deployed either as part of Windows Server 2008 or as the standalone Microsoft Hyper-V Server 2008, both of which are Windows Server variants. The virtualisation layer therefore gains all the benefits of the underlying operating system, including native multipathing (MPIO) support. Individual virtual machines are stored as VHD files on LUNs mapped to drive letters or Windows mount points, making them easy to back up or clone.
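
Because guests are simply VHD files on an NTFS volume, they can be inspected or copied with ordinary tools. As a hedged illustration, the Python sketch below reads the 512-byte footer that the VHD format places at the end of every image and reports the disk type and virtual size. The file path is a placeholder, and for a consistent backup the guest would still need to be shut down or captured via a VSS snapshot first.

    import struct

    DISK_TYPES = {2: "fixed", 3: "dynamic", 4: "differencing"}

    def read_vhd_footer(path: str) -> dict:
        """Read the VHD footer (last 512 bytes, cookie 'conectix') and summarise it."""
        with open(path, "rb") as f:
            f.seek(-512, 2)                  # footer sits at the very end of the file
            footer = f.read(512)
        if footer[0:8] != b"conectix":
            raise ValueError("Not a valid VHD footer")
        current_size = struct.unpack(">Q", footer[48:56])[0]   # virtual size in bytes
        disk_type = struct.unpack(">I", footer[60:64])[0]
        return {"type": DISK_TYPES.get(disk_type, "unknown"),
                "virtual_size_gb": current_size / 2**30}

    if __name__ == "__main__":
        # Placeholder path; point this at a real .vhd on a mounted LUN or mount point.
        print(read_vhd_footer(r"E:\VMs\web01\web01.vhd"))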

Summary

NFS storage is suitable only for VMware deployments and is not supported by Hyper-V. Typically, NAS filers are cheaper to deploy than Fibre Channel arrays, and NFS provides better out-of-band access to guest files without the need to go through the hypervisor. In the past, NFS was widely used for supporting data such as ISO installation files, but today it is deployed more broadly where the array architecture can handle the random I/O of virtual workloads.

CIFS storage is supported by Hyper-V but is probably best avoided in favour of iSCSI, even in test environments; Microsoft has now made its iSCSI Software Target freely available.

Block-based storage works well on both virtualisation platforms but can require additional hardware. Direct, out-of-band access to guest data is harder with iSCSI, Fibre Channel and FCoE, making data cloning and backup more complex.

Overall, the choice of protocol should be weighed against your requirements. There are clear pros and cons to both file- and block-based approaches, and the two can coexist in the same infrastructure. There's no doubt that both will find homes in server virtualisation for many years to come.
