This is part one of a four-part series on the updates in VMware vSphere 4.1. You can navigate through the series using the links provided below.
VMware recently released vSphere 4.1, an update to its flagship platform that adds a range of new functionality. I was given exclusive access to the product binaries and attended a series of five WebEx sessions prior to the launch. In this series I will share the core new functionality and provide an assessment of which features will make a significant impact on data centres that adopt the new version.
First, for all of you who learned how to set up the core components of vSphere 4.0 (ESX and vCenter), you will be pleased to hear that the install routines are exactly the same in vSphere 4.1. This should mean any automated build-out you have of vSphere 4 should work without modifications. The only change to the install routine is in vCenter, where there is a new dialog box that allows you to allocate a chunk of RAM to the Java Virtual Machine Web Service which drives VMware Web Services.
Storage IO Control (SIOC)
By now you might already have heard of this feature, because VMware somewhat broke its own rule and began talking about it on YouTube and various other portals. Storage IO Control deals with situations where a number of VMs share the same LUN/volume and one VM affects the performance of another. In the past your options were limited to either setting a "share" value on each VMDK that made up a VM, or using Storage vMotion to balance out your IOPS across a number of LUNs, which took some planning upfront.

SIOC adds an additional control to the properties of a VM, allowing the administrator to set a limit in I/O operations per second (IOPS). Strictly speaking, these per-VM properties are not directly part of SIOC, as they will function even if SIOC is not enabled on the datastore, so think of these limits as yet another control mechanism for prioritising your VM storage workloads. The real change is one of scope, and it addresses a weakness in vSphere 4.0's "shares"-only feature: those shares applied only to VMs on the same ESX host, so a single VM could still potentially hog throughput. With SIOC, the configuration is applied globally, covering the situation where two VMs sit on the same datastore but run on two different ESX hosts.
For the real SIOC feature to take effect, the option must be enabled in the properties of the datastore. This setting currently applies only to VMFS volumes backed by local or SAN-based storage.
Once enabled on the datastore and the virtual machine, these values can be viewed (and also changed) from the properties of the datastore and virtual machine tab like so:
While this is a welcome additional control for the virtual machine, there will be many old-school VMware admins who say that anyone needing to set these parameters has probably not thought through their VM-to-LUN allocations. With that said, even with the best planning in the world no admin can cover all eventualities, and it's better to have some controls in place to deal with unexpected spikes in storage IO than none at all.
If you do have two very disk-intensive VMs competing for I/O time, then perhaps they are better located on different LUNs backed by different spindles. The key to stopping contention on any resource in virtualisation is not to create it in the first place. In the longer term, these controls might seem somewhat tangential compared to being able to migrate virtual machines from one tier or class of storage to another on demand, as can be seen in EMC's FAST technology. With that said, even with enhancements in the storage layer, you may still need to tell ESX, using SIOC, which VMs are more important to you than others. It may well be that you need both SIOC and storage-layer enhancements like EMC FAST to get the most out of your storage.
It is worth reminding everyone that, just like VMware's "shares" technology, these IOPS limits only apply when the LUN experiences contention and I/O is constrained. Storage I/O Control is only triggered when latency exceeds a sustained threshold of 30 milliseconds.
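The interplay between the congestion threshold, shares and per-VM limits can be sketched in a few lines of Python. This is purely illustrative arithmetic, not VMware code: the 30 ms threshold and the share/limit behaviour are taken from the discussion above, and the function and VM names are my own.

```python
# Illustrative sketch (not VMware code) of how proportional "shares" divide a
# datastore's throughput once sustained latency crosses a congestion threshold.
CONGESTION_THRESHOLD_MS = 30  # the sustained-latency trigger discussed above

def allocate_iops(total_iops, vms, sustained_latency_ms):
    """vms: dict of name -> {"shares": int, "limit": IOPS cap or None}."""
    if sustained_latency_ms <= CONGESTION_THRESHOLD_MS:
        # No contention: shares are ignored, but per-VM limits still cap usage.
        return {name: cfg.get("limit") for name, cfg in vms.items()}
    total_shares = sum(cfg["shares"] for cfg in vms.values())
    alloc = {}
    for name, cfg in vms.items():
        fair = total_iops * cfg["shares"] / total_shares
        limit = cfg.get("limit")
        alloc[name] = min(fair, limit) if limit is not None else fair
    return alloc

vms = {
    "db":  {"shares": 2000, "limit": None},
    "web": {"shares": 1000, "limit": 1200},
}
print(allocate_iops(6000, vms, sustained_latency_ms=45))
# Under contention, "db" receives 4000 IOPS; "web" is capped by its 1200 IOPS
# limit even though its fair share would be 2000.
```

Note how the per-VM limit caps the "web" VM regardless of contention, while the shares only come into play once latency crosses the threshold. This mirrors the point above that the per-VM limits function even when SIOC is not enabled on the datastore.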
Network IO Control (NIOC)
In the same spirit as SIOC, VMware has introduced its sister technology, NIOC. The important thing to note here is that this feature is only available on distributed vSwitches, which limits it to Enterprise Plus customers; for some, that will kick this feature into the long grass from the get-go. When you create a dvSwitch for the first time, you will find there are now two types (4.0 and 4.1.0). If you want the new advanced features, you will need to pick the 4.1.0 type.
Fortunately for existing users, there is an "upgrade" link that appears on the main "Summary" page for an existing dvSwitch that allows you to upgrade seamlessly to the new type. As ever with VMware, there isn't a "downgrade" option to move your switch in reverse gear.
VMware is pitching the new NIOC feature at customers who are upgrading to 10 GbE, where the advantages of NIOC really are more palpable. Typically, in a 1 GbE environment, customers dedicate "vmnics" to particular traffic types such as VMotion, storage, Fault Tolerance and VMs, for both security and performance. In a 10 GbE environment, the server might contain only two 10 GbE cards, from which the administrator makes a NIC team, and NIOC allows you to control how that aggregated bandwidth is shared among different traffic types within ESX.
As with SIOC, its network equivalent imposes controls in the form of shares and limits. As with CPU, memory and storage limits, a NIOC limit represents an absolute amount of bandwidth for a team, specified in Mbps. Currently, these controls apply only to outbound (TX) traffic, referred to as "Egress" on a dvSwitch. The scope of the two controls differs: the share setting (which is like a priority value: low, normal, high and custom) applies to the physical "vmnic," whereas the limit value applies to the whole team. Essentially, these controls allow the administrator to set values on what VMware has dubbed the "Network Resource Pool." Unlike resource pools in DRS, these are pre-created by the dvSwitch to represent the different traffic types in vSphere; they are not created by the administrator.
In the screen grab below, you can see the total network bandwidth available to the dvSwitch is 4,000 Mbps. This is a dvSwitch with two ESX hosts added, each with two 1 Gb NICs (4 x 1 Gb in total). Enabling NIOC is very simple: you hit a properties option in the "Resource Allocation" tab, and then right-click each network resource pool and set the share or limit value (or both!) for the traffic type in question. This is a significant improvement over the previous setup, where the administrator had to tinker with the "traffic shaping" options of each port group on the dvSwitch.
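To make the share-versus-limit distinction concrete, here is a hypothetical sketch in Python of how a saturated uplink might be divided among network resource pools. The share weightings (25/50/100 for low/normal/high) and the pool names are illustrative assumptions of mine, not values taken from the product.

```python
# Hypothetical sketch of NIOC-style arithmetic: "shares" give each traffic
# type a proportional slice of a saturated link, while a "limit" (in Mbps)
# caps that traffic type outright.
SHARE_VALUES = {"low": 25, "normal": 50, "high": 100}  # assumed weightings

def nioc_allocation(link_mbps, pools):
    """pools: dict of traffic type -> {"shares": "low|normal|high", "limit": Mbps or None}."""
    total = sum(SHARE_VALUES[p["shares"]] for p in pools.values())
    alloc = {}
    for name, p in pools.items():
        slice_mbps = link_mbps * SHARE_VALUES[p["shares"]] / total
        limit = p.get("limit")
        alloc[name] = min(slice_mbps, limit) if limit is not None else slice_mbps
    return alloc

pools = {
    "vmotion": {"shares": "normal", "limit": 2000},
    "vm":      {"shares": "high",   "limit": None},
    "ft":      {"shares": "low",    "limit": 500},
}
print(nioc_allocation(10000, pools))
# VMotion's proportional slice would be ~2857 Mbps, but its 2000 Mbps limit
# wins; FT is likewise capped at 500 Mbps; VM traffic takes its full share.
```

The same two-level pattern as SIOC applies: shares arbitrate only under contention, while limits are absolute ceilings.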
Again, these controls are a welcome enhancement to the vSphere product, although their usefulness is debatable.
First, to gain access to them you must purchase the most expensive SKU available from VMware. You could quite easily argue that this isn't a huge barrier considering the Enterprise Plus customers are the very same folks likely to be the ones rolling out 10 GB networking first. Second, I personally think the way forward is with converged networking or virtual I/O models that allow you to do this kind of bandwidth management at the network layer. This method of managing bandwidth reduces cabling hell, and allows the administrator to carve up blocks of bandwidth for both Ethernet and storage networking.
Most shops that are considering rolling out 10 GbE are assessing this technology. I think they will be looking for a solution that manages this bandwidth independently of the hypervisor, because network convergence has applications above and beyond virtualisation.
Memory Compression
Memory compression is the latest addition to VMware's stable of RAM technologies, joining memory over-commitment, transparent page sharing (TPS) and the "balloon driver".
Memory compression works in a similar way to TPS in that it is seamless to the virtual machine: it takes blocks of memory that haven't been used recently and compresses the data to recoup memory for the wider system. Of course, there is an overhead in decompressing that data if it needs to be modified, but compared with swap activity to disk this overhead is trivial. This low-latency alternative to swapping pays dividends in situations where a lot of VMs are running on the same ESX host. The memory compression feature adds additional performance counters to the popular "esxtop" and "resxtop" utilities, and adds new Mem.MemZip parameters (such as Mem.MemZipEnable and Mem.MemZipMaxPct) to the advanced settings, although VMware recommends not changing these values.
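You can get a feel for the trade-off with the standard zlib library: compressing a partly redundant 4 KB "page" shrinks it dramatically, and decompressing it is orders of magnitude faster than a disk swap-in would be. This is an illustrative sketch only, not how ESX implements the feature internally.

```python
import zlib

# Illustrative only: compress an idle 4 KB "page" the way a hypervisor might
# reclaim RAM before resorting to swapping the page out to disk.
page = (b"A" * 2048) + bytes(range(256)) * 8   # exactly 4096 bytes, partly redundant
compressed = zlib.compress(page)
ratio = len(compressed) / len(page)
print(f"4 KB page -> {len(compressed)} bytes ({ratio:.0%} of original)")

# Recovering the page is a cheap in-memory operation, unlike a disk swap-in.
assert zlib.decompress(compressed) == page
```

The better a page compresses, the more memory the host claws back; pages that do not compress well are poor candidates and would fall through to swapping instead.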
VMotion Enhancements
Building on the improvements in vSphere 4.0, vSphere 4.1 once again increases the number of simultaneous VMotions (not a medical condition): up to 4 VMs on a 1 GbE network and 8 on a 10 GbE network, with up to 128 ESX hosts pointing at the same shared NFS or VMFS volume. This should speed up the process of entering maintenance mode on an ESX host, and accelerate other features that depend on it, such as VMware Update Manager and Distributed Power Management.
Related to VMotion, vSphere 4.1 introduces updates to the Enhanced VMotion Compatibility (EVC) feature when DRS is enabled on a VMware cluster. VMware has introduced new masking parameters to account for the fact that future AMD processors will lose the 3DNow! attribute, which, if not dealt with at this stage, could introduce a new CPU incompatibility for customers who have chosen the AMD processor family.
Additionally, the EVC feature now examines the live VM's CPU feature set, which should allow EVC to locate an ESX host that can provide the CPU attributes actually in use. Theoretically, this should reduce errors and aid the troubleshooting process within VMotion.
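Conceptually, EVC behaves like set arithmetic over CPU feature flags: the cluster baseline is the intersection of what every host offers, and a live VM can only land on a host that supplies every feature it is actually using. The sketch below is a hypothetical illustration with made-up host and feature names, not the vSphere API.

```python
# Hypothetical illustration of EVC-style masking using Python sets.
def evc_baseline(hosts):
    """Cluster baseline: the CPU features common to every host."""
    return set.intersection(*(set(f) for f in hosts.values()))

def compatible_hosts(vm_features, hosts):
    """Hosts offering every feature the live VM is actually using."""
    return [h for h, feats in hosts.items() if set(vm_features) <= set(feats)]

hosts = {
    "esx01": {"sse2", "sse3", "3dnow"},
    "esx02": {"sse2", "sse3"},   # a newer AMD part without 3DNow!
}
print(evc_baseline(hosts))                         # '3dnow' is masked out
print(compatible_hosts({"sse2", "3dnow"}, hosts))  # only esx01 qualifies
```

This is why masking out 3DNow! in advance matters: once a VM has been exposed to a feature, only hosts that still present it remain valid VMotion targets.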
Once again VMware has shown itself to be head and shoulders above the competition in terms of extracting improved performance from its virtualisation layer. It does so to such an extent that it makes you wonder what is left to conquer, when the only real barrier to virtualisation from a performance perspective is whether you can afford the hardware to support the larger workloads.
Perhaps this is what VMware's erstwhile performance guru, Scott Drummonds, had in mind when he recently bid VMware adieu to join EMC and their vSpecialist team. To some degree the debate has moved away from who has the best performing virtualisation layer to who has the best features, at the best price, and who is able to leverage new enhancements in the four core resources.
Editor's note: In order to deliver this content in as timely a fashion as possible, the author used the BETA product to take the screen grabs. Please note these may be slightly different in the GA release. We will update the graphics as soon as possible.
ABOUT THE AUTHOR: Mike Laverick is a professional instructor with 15 years of experience with technologies such as Novell, Windows and Citrix, and has been involved with the VMware community since 2003. Laverick is a VMware forum moderator and member of the London VMware User Group Steering Committee. In addition to teaching, Laverick is the owner and author of the virtualisation website and blog RTFM Education, where he publishes free guides and utilities aimed at VMware ESX/VirtualCenter users. In 2009, Laverick received the VMware vExpert award and helped found the Irish and Scottish user groups. Laverick has had books published on VMware Virtual Infrastructure 3, VMware vSphere 4 and VMware Site Recovery Manager.