Server virtualisation has transformed the datacentre, spurring growth and consolidation in x86 workloads. And as virtualised infrastructures have matured, the ecosystem surrounding them has moved to meet the requirements of organisations that want to offer their users a service provider-like environment – the private cloud.
OpenStack is an open-source project that provides a complete ecosystem for delivering private cloud infrastructure.
OpenStack is built up from many different modules that cover virtual machines/compute (Nova), object storage (Swift), block storage (Cinder), networking (Neutron), dashboard (Horizon), identity services (Keystone), image services (Glance), telemetry (Ceilometer) and orchestration (Heat).
Storage functionality is provided by three of those components.
Swift is the sub-project that delivers object storage. It provides similar functionality to Amazon S3 – more of which later. Cinder is the block-storage component, delivered using standard protocols such as iSCSI. Glance provides a repository for VM images and can use storage from basic file systems or Swift.
OpenStack evolved from collaborative work between Rackspace and NASA and has gained popularity as a platform on which to develop scale-out applications. Today we see it used by service providers to deliver public clouds and by large organisations to develop private cloud infrastructure.
To date, more than 200 companies have become involved with OpenStack, providing funds and resources to develop the project. Since the first release (codenamed Austin) in 2010, which only supported object storage, features have gradually been added, with Cinder support appearing in the Folsom release in 2012. The availability of Cinder has enabled traditional and startup storage suppliers to adapt their products to deliver storage into OpenStack environments.
OpenStack Cinder for block storage
Block storage is a fundamental requirement for virtual infrastructures. It is the foundation for storing virtual machines and data used by those machines. Before block storage support was available, OpenStack virtual machines used so-called ephemeral storage, which meant the contents of the virtual machine were lost when that VM was terminated.
Cinder is the OpenStack component that provides access to, and manages, block storage. To OpenStack hosts, the storage appears as a block device that uses iSCSI, Fibre Channel, NFS or a range of proprietary protocols for back-end connectivity.
The Cinder interface specifies a number of discrete functions, including basic functionality such as create volume, delete volume and attach. There are also more advanced functions that can extend volumes, take snapshots, clone and create volumes from a VM image.
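The functions listed above can be pictured with a small in-memory model. This is a sketch only: the `ToyVolumeService` class, its method names and its ID scheme are hypothetical and are not the Cinder API or any supplier's driver interface. It simply illustrates how create, delete, attach, extend, snapshot and clone relate to one another.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Volume:
    volume_id: str
    size_gb: int
    attached_to: Optional[str] = None  # ID of the VM using the volume, if any
    source: Optional[str] = None       # volume this one was cloned from, if any


class ToyVolumeService:
    """Hypothetical in-memory stand-in for a block storage back end."""

    def __init__(self) -> None:
        self.volumes: Dict[str, Volume] = {}
        self.snapshots: Dict[str, int] = {}  # snapshot ID -> size in GB
        self._ids = itertools.count(1)

    def create_volume(self, size_gb: int, source: Optional[str] = None) -> Volume:
        vol = Volume(f"vol-{next(self._ids)}", size_gb, source=source)
        self.volumes[vol.volume_id] = vol
        return vol

    def delete_volume(self, volume_id: str) -> None:
        # Refuse to delete a volume that a VM is still using.
        if self.volumes[volume_id].attached_to is not None:
            raise RuntimeError("detach the volume before deleting it")
        del self.volumes[volume_id]

    def attach(self, volume_id: str, vm_id: str) -> None:
        self.volumes[volume_id].attached_to = vm_id

    def extend(self, volume_id: str, new_size_gb: int) -> None:
        vol = self.volumes[volume_id]
        if new_size_gb <= vol.size_gb:
            raise ValueError("new size must be larger than the current size")
        vol.size_gb = new_size_gb

    def snapshot(self, volume_id: str) -> str:
        # Record a point-in-time marker; a real back end would preserve data.
        snap_id = f"snap-{next(self._ids)}"
        self.snapshots[snap_id] = self.volumes[volume_id].size_gb
        return snap_id

    def clone(self, volume_id: str) -> Volume:
        # A clone is a new volume of the same size that remembers its source.
        src = self.volumes[volume_id]
        return self.create_volume(src.size_gb, source=src.volume_id)
```

In a real deployment each of these calls would be translated by a supplier's driver into operations on the array or storage cluster behind it.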
Many storage array suppliers now provide Cinder block device support. These include EMC, Hitachi Data Systems, HP, IBM and NetApp. There is also considerable support for Cinder from startups, including SolidFire, Nexenta, Pure Storage and Zadara Storage.
Most suppliers provide support for iSCSI, with some including Fibre Channel and NFS connectivity.
OpenStack implementations are usually built around scale-out server configurations, so Fibre Channel is not the best choice of protocol. It is likely to be expensive and complex to implement due to hardware costs and the issues of scaling Fibre Channel over large numbers of storage nodes.
NFS support was introduced with the Grizzly release of OpenStack, although it had been brought in experimentally with Folsom. Virtual machine volumes under NFS storage are treated as individual files, in a similar way to the implementation of NFS storage on VMware or VHDs on Hyper-V.
By encapsulating the virtual disk as a file, systems that are able to perform snapshots or other functions at the file level can use this as a way of implementing features such as cloning.
Some startup suppliers support Cinder using their own protocols, for example Scality and Coraid. There are also open-source storage solutions from Ceph and GlusterFS that provide Cinder support using the Ceph RADOS Block Device (RBD) and the native GlusterFS protocol, respectively.
The Ceph implementation is interesting because it uses code that has already been integrated into the Linux kernel, making configuration and support easy to implement. Ceph can also be used as a target for Glance VM images.
OpenStack Swift object storage
Object stores reference data as binary objects (rather than as files or LUNs), typically storing or retrieving the entire object in one command. Objects are stored and referenced using HTTP (web-based) protocols with simple commands such as PUT and GET.
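As a rough illustration of that request style, the sketch below builds (but does not send) a PUT and a GET for a single object using Python's standard library. The endpoint, account name, container, object name and token are all made up for the example, and the exact URL layout and authentication details should be checked against the deployment in question.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical endpoint and account name, used for illustration only.
ENDPOINT = "https://swift.example.com/v1/AUTH_demo"


def object_url(container: str, name: str) -> str:
    """Build the URL that addresses one object inside a container."""
    return f"{ENDPOINT}/{quote(container)}/{quote(name)}"


def build_put(container: str, name: str, data: bytes, token: str) -> Request:
    """Describe an upload of the whole object in a single PUT."""
    return Request(
        object_url(container, name),
        data=data,
        method="PUT",
        headers={"X-Auth-Token": token},
    )


def build_get(container: str, name: str, token: str) -> Request:
    """Describe a download of the whole object in a single GET."""
    return Request(
        object_url(container, name),
        method="GET",
        headers={"X-Auth-Token": token},
    )


# Nothing is sent over the network; the requests are only constructed.
put = build_put("photos", "holiday/beach 01.jpg", b"...image bytes...", "example-token")
get = build_get("photos", "holiday/beach 01.jpg", "example-token")
print(put.get_method(), put.full_url)
print(get.get_method(), get.full_url)
```

The point to note is that each request names the complete object and carries, or returns, its entire contents; there is no equivalent of writing to a block offset within it.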
In the case of Swift, objects are physically stored on “object servers”, one of a number of entities that form a “ring” that also includes proxy servers, container servers and account servers.
A ring represents components that deliver the Swift service. These server components manage access and track the location of objects in a Swift store. Metadata is used to store information about the object. Swift uses the extended attributes of a file to store metadata.
To provide resilience, rings can be divided into zones, within which data is replicated to cater for hardware failure. By default, three replicas of data are created, each stored in a separate zone. In the context of Swift, a zone could be represented by a single disk drive, a server or a device in another datacentre.
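The effect of zoning can be sketched with a toy placement function. This is not Swift's actual ring algorithm, which is considerably more involved; the zone names, device names and the hashing scheme here are assumptions chosen only to show three replicas landing in three different zones.

```python
import hashlib
from typing import Dict, List

# Hypothetical devices, keyed by the zone each one belongs to.
DEVICES_BY_ZONE: Dict[str, List[str]] = {
    "zone1": ["z1-disk-a", "z1-disk-b"],
    "zone2": ["z2-disk-a", "z2-disk-b"],
    "zone3": ["z3-disk-a"],
    "zone4": ["z4-disk-a", "z4-disk-b"],
}


def place_replicas(object_path: str, replicas: int = 3) -> List[str]:
    """Pick one device in each of `replicas` distinct zones for an object."""
    zones = sorted(DEVICES_BY_ZONE)
    if replicas > len(zones):
        raise ValueError("not enough zones to keep every replica separate")
    # A stable hash of the object path decides where placement starts,
    # so the same object always maps to the same devices.
    digest = int(hashlib.sha256(object_path.encode("utf-8")).hexdigest(), 16)
    start = digest % len(zones)
    chosen = []
    for i in range(replicas):
        zone = zones[(start + i) % len(zones)]
        devices = DEVICES_BY_ZONE[zone]
        chosen.append(devices[digest % len(devices)])
    return chosen


print(place_replicas("/AUTH_demo/photos/beach.jpg"))
```

Because each replica sits in a different zone, the loss of any one zone, whether that is a disk, a server or a site, still leaves two copies available.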
Swift uses the idea of eventual consistency when replicating data for resilience. This means data is not replicated synchronously across the OpenStack cluster, but rather duplicated as a background task. Replication of objects may fail or be suspended if a server is down or the system is under high load.
The idea of eventual consistency may seem risky and it is possible that, in certain scenarios, data may be inconsistent if a server fails before replicating to other nodes in the OpenStack cluster. It is the job of the proxy server in Swift to ensure I/O requests are routed to the server with the most up-to-date copy of an object and, if a server is unavailable, to route the request elsewhere in the cluster.
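A minimal sketch of that routing decision is shown below, assuming each copy carries a timestamp of the last write it has seen. The `Replica` type and `read_latest` function are hypothetical names for illustration and do not reflect Swift's internal code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    server: str
    timestamp: float      # time of the last write this server has seen
    data: bytes
    available: bool = True


def read_latest(replicas: List[Replica]) -> Optional[Replica]:
    """Return the newest copy held by any reachable server, or None."""
    reachable = [r for r in replicas if r.available]
    if not reachable:
        return None
    return max(reachable, key=lambda r: r.timestamp)


copies = [
    Replica("object-server-1", timestamp=100.0, data=b"v1"),  # not yet updated
    Replica("object-server-2", timestamp=250.0, data=b"v2"),
    Replica("object-server-3", timestamp=250.0, data=b"v2", available=False),
]
latest = read_latest(copies)
```

Here the read is served from the second server, which holds the newer write. If that server were also down, the only reachable copy would be the older one on the first server, which is the kind of temporary inconsistency described above.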
As Swift has developed, new and enhanced features have been added to the platform.
The Grizzly release of OpenStack provided more granular replica controls, allowing rings to have adjustable replica counts. It also introduced the idea of timing-based sorting for object servers, allowing read requests to be served by the fastest-responding Swift server. This is especially useful in designs that distribute Swift servers over a WAN.
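The idea behind timing-based sorting can be illustrated in a few lines. The function name, node names and timing figures below are assumptions for the example, not Swift's implementation or configuration.

```python
from typing import Dict, List


def sort_by_response_time(nodes: List[str], timings: Dict[str, float]) -> List[str]:
    """Order nodes so the fastest recent responder is tried first.

    Nodes with no recorded timing are treated as slowest and tried last.
    """
    return sorted(nodes, key=lambda node: timings.get(node, float("inf")))


# Hypothetical recent response times in seconds, e.g. local versus remote sites.
recent_timings = {"local-1": 0.004, "remote-1": 0.085, "local-2": 0.006}
order = sort_by_response_time(
    ["remote-1", "local-2", "new-node", "local-1"], recent_timings
)
```

With these figures the two local servers are tried before the remote one, so a read only crosses the WAN when the nearer copies cannot serve it.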
Of course, because Swift is implemented using the HTTP protocol, there is no requirement to store data locally and it would be perfectly possible to store object data in another platform, such as Cleversafe, Scality or even Amazon S3.
Swift provides the ability to deliver resilient scale-out storage on commodity hardware, and this may be preferable to, and more cost-effective than, using an external solution.
Choosing OpenStack storage
Obviously, object and block storage have very different characteristics, making them suited to storing different types of data.
Swift was designed to be a highly scalable, eventually consistent object store and so it is suited to storing large volumes of data, such as images, media or even files.
It is analogous to Amazon’s S3 platform and uses similar protocols and commands to store and retrieve data. It also provides support for features such as versioning, keeping track of multiple copies of a single object where required.
Swift is not suited to storing virtual machines because it reads and writes only entire objects and offers no guarantee of immediate consistency.
Cinder, meanwhile, provides block storage interfaces that deliver more traditional storage resources and is most suitable for storing persistent copies of virtual machines in an OpenStack environment.
Cinder can be supported on local servers with in-built support for open-source projects such as Ceph and GlusterFS. It is also possible to use built-in tools, such as server logical volume managers or NFS servers, to provide storage to an OpenStack cluster.
Where more resilient or scalable block solutions are required, external storage arrays can be used, with many suppliers providing Cinder drivers.
The efficiency of supplier implementations will vary by platform. For example, SolidFire’s scale-out storage system is totally API-driven, making the Cinder driver implementation very lightweight. Other legacy hardware providers may not have such simple implementations, in some cases using SMI-S as the interface into the storage.
Using an external storage array for Cinder storage brings two main benefits: performance and availability can be managed by the array, and the array can be used to deliver storage savings through features such as thin provisioning, compression and data deduplication.
Of course, not all supplier implementations are equal and some may require more integration work than others, so it is worth validating with your chosen supplier exactly how Cinder support is implemented.
OpenStack storage summary
OpenStack provides a continually maturing set of features that meet the needs of block, file and object storage. Although the options seem complex, wide supplier support means OpenStack can easily be integrated into an existing storage architecture, allowing consistency for storage management in enterprise deployments.
This was first published in June 2014