
Container storage: Five key things you need to know
We look at container storage and backup, covering how storage works in containers, the container storage interface, container-native storage, and the management platforms storage suppliers offer
Most enterprises now run applications in containers, and so they must pay attention to how they store and manage data for containerised applications.
The Nutanix Enterprise Cloud Index, compiled for the cloud software supplier by Vanson Bourne and published earlier this year, found that 54% of firms have containerised all their applications and as many as 98% run at least one instance of Kubernetes.
This, however, poses a number of challenges for IT architects when it comes to data storage. Containers were designed to be ephemeral, or temporary, in nature. This works well enough for microservices. But mainstream enterprise applications need to process and retain data. This has required developers to adapt container technology to support persistent storage.
Containers bring a range of advantages to enterprises. Containerised applications are packaged with their dependencies and run largely independently of the host environment, making them highly portable. This helps businesses that run applications in hybrid or multicloud environments.
Containers are also “light”, demanding fewer resources, especially storage, than conventional virtual machines. Containers are more efficient, and spin up in seconds rather than minutes.
And, while not all containerised applications are microservices, the efficiency of containers lends itself to running them and allows the construction of complex applications out of small, reusable and efficient parts.
How do containers and storage connect?
The first generation of containers was designed to be stateless. This had advantages in speed of deployment and efficiency. But stateless, or impermanent, applications cannot store data beyond the lifetime of the container.
Stateless applications work in some situations, such as web services or microservices that do not need to store and access data on an ongoing basis. But that ability to handle data is central to many, if not most, enterprise applications.
As a result, container technology has adapted by adding persistent storage. Persistent storage sits outside the container, and can reside on-premise or in the cloud, as file, block or object storage.
The container orchestration layer manages persistent storage. In the case of Kubernetes – the most common container orchestration system – data is stored in persistent volumes (PVs), which applications request through persistent volume claims (PVCs). A PVC is portable and can move with the container.
PVs exist independently of any pod, so the data outlives the container that uses it, but they are not portable across Kubernetes clusters. Both PVs and PVCs, however, serve to decouple the container and the storage, so that “conventional” storage works with containerised applications.
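As a minimal sketch of that decoupling, the Python below builds a PVC and a pod that mounts it, then prints both as a JSON manifest of the kind `kubectl apply -f` accepts. The names `orders-data` and `orders-db`, the `postgres:16` image, the mount path and the 10Gi size are illustrative assumptions, not details from any supplier.

```python
import json

# Hypothetical claim name, used only for illustration.
CLAIM_NAME = "orders-data"


def build_pvc(name: str, size: str) -> dict:
    """Return a PersistentVolumeClaim manifest requesting `size` of storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


def build_pod(name: str, image: str, claim_name: str, mount_path: str) -> dict:
    """Return a Pod manifest that mounts the claim; the pod never names a PV."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            "volumes": [
                {
                    "name": "data",
                    "persistentVolumeClaim": {"claimName": claim_name},
                }
            ],
        },
    }


pvc = build_pvc(CLAIM_NAME, "10Gi")
pod = build_pod("orders-db", "postgres:16", CLAIM_NAME, "/var/lib/postgresql/data")

# Wrap both objects in a v1 List so they can be applied as one document.
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [pvc, pod]}, indent=2))
```

The pod refers only to the claim by name. Which PV satisfies the claim, and which storage system sits behind it, is left to the cluster.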
The challenge for IT teams, however, is that this is far from plug-and-play. The containerised applications, the orchestration layer and the storage all need to work together seamlessly to allow an enterprise application to work.
How does CSI help with data storage for containers?
To simplify and standardise how containers connect to storage, the industry has developed the container storage interface (CSI) and container-native storage.
CSI works with cloud, on-premise and hybrid storage. It covers file and block storage natively, and some drivers also expose object storage. This allows developers to tailor their storage to their workloads.
CSI is a specification that allows storage suppliers to connect their technology to Kubernetes through drivers. Currently, there are more than 100 different CSI drivers available, for regular and software-defined storage.
CSI continues to evolve, adding support for more storage formats and more suppliers. One further advantage of CSI is that it helps IT teams consistently manage storage, even across multiple suppliers’ infrastructure.
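To illustrate how a CSI driver surfaces in Kubernetes, the sketch below builds a StorageClass that names a driver as its provisioner, plus a PVC that asks for storage from that class. The driver name `csi.example.com`, the class name `fast-block` and the claim details are hypothetical; a real driver documents its own provisioner name and any parameters it accepts.

```python
import json


def build_storage_class(name: str, provisioner: str) -> dict:
    """Return a StorageClass manifest bound to one CSI driver.

    Driver-specific settings would go in a "parameters" map; none are
    assumed here because they differ from supplier to supplier.
    """
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        "reclaimPolicy": "Delete",
        # Delay volume creation until a pod using the claim is scheduled.
        "volumeBindingMode": "WaitForFirstConsumer",
    }


def build_claim(name: str, storage_class: str, size: str) -> dict:
    """Return a PVC that requests dynamically provisioned storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical driver and class names, for illustration only.
storage_class = build_storage_class("fast-block", "csi.example.com")
claim = build_claim("reports-data", "fast-block", "50Gi")

print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [storage_class, claim]}, indent=2))
```

Switching suppliers in this model means pointing the claim at a different StorageClass, while the application's own manifests stay the same.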
What is container-native storage?
Container-native storage, for its part, is software-defined storage that itself runs in containers, on Kubernetes. Container-native storage offers the prospect of only allocating storage to the container when the container needs it, making it more flexible than other forms of storage.
Container-native storage products include Red Hat’s OpenShift Data Foundation (ODF), Pure’s Portworx and Nutanix’s Unified Storage.
According to industry analyst Gartner, 95% of global organisations will have containerised applications in production by 2029.
The container-native storage market, however, is less mature. Industry estimates put take-up for Portworx and Red Hat ODF, combined, at under 30% of the market, although analysts expect the market to more than double by the end of this decade. This suggests that, for now, enterprises are sticking with CSI.
How do storage suppliers support container storage and backup?
Suppliers are working to make container storage easier to manage and better able to work across a range of storage technologies. This is all the more important for enterprises that operate hybrid clouds. Some firms want to keep storage in-house or in private clouds, but still want to take advantage of cloud-native and containerised applications.
As a result, suppliers including Dell EMC, HPE, Hitachi Vantara, IBM, NetApp and Pure have each worked to improve support for containers. The hyperscale cloud providers have also continued to add to their container support.
These technologies are now fairly mature and, as such, should allow enterprises to run containerised applications with persistent storage without changing hardware or cloud storage provision.
On-premise and cloud providers have also added better support for backup and recovery of containerised environments. Robust backup is an essential feature for enterprise production deployments.
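One building block behind many Kubernetes backup tools is the volume snapshot. The sketch below, offered as an assumption-laden illustration rather than any supplier's method, builds a VolumeSnapshot of an existing PVC and a new PVC restored from it. It assumes the cluster has the snapshot API and a CSI driver that supports snapshots installed; the names `orders-data`, `orders-data-snap`, `example-snapclass` and `fast-block` are hypothetical.

```python
import json


def build_volume_snapshot(name: str, pvc_name: str, snapshot_class: str) -> dict:
    """Return a VolumeSnapshot manifest capturing the named PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def build_restore_claim(name: str, snapshot_name: str, storage_class: str, size: str) -> dict:
    """Return a PVC whose contents are populated from a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical object names, for illustration only.
snapshot = build_volume_snapshot("orders-data-snap", "orders-data", "example-snapclass")
restored = build_restore_claim("orders-data-restored", "orders-data-snap", "fast-block", "10Gi")

print(json.dumps(snapshot, indent=2))
print(json.dumps(restored, indent=2))
```

A snapshot is a point-in-time copy that usually lives on the same storage system, so it complements a full backup rather than replacing one.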
What container management products do suppliers offer?
The challenge remains, however, to strip out more of the complexity around persistent storage and containers.
Tools such as Pure’s Portworx and NetApp’s Trident aim to simplify provisioning of Kubernetes applications, as well as improve portability and protection.
NetApp’s Trident is open source, free of charge and uses CSI. It supports automatic provisioning of NetApp ONTAP storage as PVs for Kubernetes. Trident also offers data management, data protection, and disaster recovery and business continuity for container environments.
Portworx, too, provides automated data services and policy-driven management. It uses CSI and pools underlying storage into a single data fabric, which is then shared across clusters.
Pure points out that Portworx provides a consistent model for storage across cloud, hybrid and on-premise storage, with “cloud-like agility and responsiveness” for on-premise environments.
Pure recently integrated its Fusion intelligent control plane into Portworx, and added an artificial intelligence (AI) co-pilot, which it says can monitor Kubernetes clusters at scale. Portworx also integrates backup and recovery and automated capacity management into its platform.
These developments should all make it easier for developers to create containerised applications that need persistent storage. They should also take away some of the overheads of storage and data management, as well as disaster recovery, once containerised software is running in production.