
Container storage 101: What is CSI and how does it work?

We look at the container storage interface, which provides an interface to persistent storage in the products of storage array makers, and how it relates to Kubernetes

Containers are sweeping the datacentre. They’re a lightweight method of virtualising applications that allows for rapid scaling, and they contain all that’s needed to run processes in a multitude of environments with few dependencies.

But they need storage. And while containers were originally conceived as stateless, with storage as ephemeral as the containers themselves, it soon became apparent that containerised applications needed to retain data for longer.

So, a variety of ways of achieving persistent storage for containers have been developed. The container landscape is largely represented by Docker and the container orchestrator Kubernetes, though there are others available.

First, let’s recap on the basics of Kubernetes storage and its key methods of defining and calling on storage.

We have previously looked in detail at three of these. Persistent volumes define available storage volumes by a variety of parameters that include performance and capacity. Storage classes group the persistent volumes available and specify the method of connection to Kubernetes. Persistent volume claims express the requirements of the developer or application, and are bound to persistent volumes as required.

Then we looked at Rook, which runs in the Kubernetes cluster and exposes and orchestrates persistent storage across a range of storage types. Rook is essentially software-defined storage, built within Kubernetes and using a containerised architecture to provide the storage in something like a hyper-converged or hyper-scale infrastructure.

Rook isn’t aimed at using mainstream storage arrays as capacity – they come with controller capability built in, and it’s not the aim of Rook to provide that. Instead, it focuses on managing capacity from a range of storage types that span Ceph file, block and object, some parallel and NAS file systems and some database data services.

In this article, we’ll drill down into the container storage interface (CSI) drivers that allow storage makers to expose their products to Kubernetes as persistent storage.

CSI: Connects Kubernetes to 60+ storage products

CSI is the container storage interface. It is a standard interface for Kubernetes and other container orchestrators, and storage suppliers write plugins (drivers) to it to expose their products to containerised applications as persistent storage.

At the time of writing, there are more than 60 CSI drivers available for a wide range of file, block and object storage in hardware and cloud formats.

CSI is essentially an interface between container workloads and third-party storage. It supports the creation and configuration of persistent storage external to the orchestrator, input/output (I/O) to that storage, and advanced functionality such as snapshots and cloning.
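To make the idea of an interface concrete, here is a minimal sketch in Python. The real CSI specification defines gRPC services with calls such as CreateVolume, DeleteVolume and CreateSnapshot. The class names, method signatures and in-memory "array" below are simplified assumptions for illustration only, not the specification itself or any supplier's driver.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
import uuid


@dataclass
class Volume:
    volume_id: str
    name: str
    capacity_bytes: int


@dataclass
class Snapshot:
    snapshot_id: str
    source_volume_id: str


class ControllerService(ABC):
    """Simplified stand-in for the controller-side calls a CSI plugin answers."""

    @abstractmethod
    def create_volume(self, name, capacity_bytes):
        """Provision a volume on the backing storage."""

    @abstractmethod
    def delete_volume(self, volume_id):
        """Remove a volume from the backing storage."""

    @abstractmethod
    def create_snapshot(self, source_volume_id):
        """Take a point-in-time copy of an existing volume."""


class InMemoryDriver(ControllerService):
    """Hypothetical driver: a dict stands in for a real storage array."""

    def __init__(self):
        self.volumes = {}
        self.snapshots = {}

    def create_volume(self, name, capacity_bytes):
        # Repeating a request with the same name returns the same volume,
        # so the orchestrator can safely retry.
        for volume in self.volumes.values():
            if volume.name == name:
                return volume
        volume = Volume(str(uuid.uuid4()), name, capacity_bytes)
        self.volumes[volume.volume_id] = volume
        return volume

    def delete_volume(self, volume_id):
        # Deleting an unknown volume is treated as success.
        self.volumes.pop(volume_id, None)

    def create_snapshot(self, source_volume_id):
        if source_volume_id not in self.volumes:
            raise KeyError(f"unknown volume: {source_volume_id}")
        snapshot = Snapshot(str(uuid.uuid4()), source_volume_id)
        self.snapshots[snapshot.snapshot_id] = snapshot
        return snapshot


if __name__ == "__main__":
    driver = InMemoryDriver()
    volume = driver.create_volume("pvc-demo", 10 * 1024**3)
    snapshot = driver.create_snapshot(volume.volume_id)
    print(volume)
    print(snapshot)
```

The point of the sketch is the separation of roles: the orchestrator only ever calls the abstract methods, while each storage supplier provides its own implementation behind them.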

CSI replaces plugins developed earlier in the Kubernetes evolution, such as in-tree volume plugins and FlexVolume plugins. Without going into detail, the advantage of CSI is that it has been designed to provide a simplified set of specifications to which storage suppliers can write their plugins, and that those plugins are not dependent on the Kubernetes release cycle.

Orientation: CSI in PVs, storage class and PVCs

Once you have deployed a CSI driver to a Kubernetes cluster, it is available for use with persistent volumes (PVs), storage classes and persistent volume claims (PVCs).

For example, you can create a storage class that points to external storage defined by a CSI plugin. Then, dynamic provisioning can be triggered by a PVC that specifies that storage class. When the claim invokes the creation of a volume, the external storage carries it out via the CSI driver and the resulting PV is bound to the PVC. Alternatively, it is also possible to expose a pre-existing volume via a PV.
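The objects described above can be sketched as Kubernetes manifests. The Python below builds them as plain dictionaries and prints them as JSON, which kubectl accepts as well as YAML. The provisioner name csi.example.com, the object names, the sizes and the volume handle are all hypothetical; a real CSI driver documents its own provisioner string and any parameters it requires.

```python
import json

# Hypothetical driver name (assumption). Use the name your CSI driver registers.
PROVISIONER = "csi.example.com"


def storage_class(name, provisioner=PROVISIONER):
    """Storage class that points dynamic provisioning at a CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        "reclaimPolicy": "Delete",
    }


def volume_claim(name, storage_class_name, size="10Gi"):
    """PVC that triggers dynamic provisioning by naming a storage class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class_name,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


def static_volume(name, volume_handle, size="10Gi", driver=PROVISIONER):
    """PV that exposes a volume which already exists on the external storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",
            # volumeHandle is the ID the storage system uses for the volume.
            "csi": {"driver": driver, "volumeHandle": volume_handle},
        },
    }


if __name__ == "__main__":
    manifests = [
        storage_class("fast-array"),
        volume_claim("app-data", "fast-array"),
        static_volume("legacy-data", "existing-volume-id"),
    ]
    for manifest in manifests:
        print(json.dumps(manifest, indent=2))
```

In the dynamic case only the storage class and the claim are written by hand, and the PV is created on the claim's behalf. In the static case the PV is written by hand and refers to a volume that is already present on the array.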

CSI gained general availability (GA) status at version 1.13 of Kubernetes, which was at 1.17 at the time of writing.

Beta functionality likely to make it to future GA versions includes the ability to expose raw block storage to the container, awareness of where storage is provisioned in terms of cloud zone and region, and volume snapshots.
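As a rough illustration of how these three features surface in manifests, the Python sketch below builds a raw block claim, a zone-aware storage class and a volume snapshot request as dictionaries. The object names, zone values, topology label key and snapshot API version are assumptions: the correct API version depends on the Kubernetes release and the snapshot components installed, and some CSI drivers use their own topology keys.

```python
import json


def block_claim(name, storage_class_name, size="10Gi"):
    """PVC asking for a raw block device rather than a mounted file system."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class_name,
            "volumeMode": "Block",
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


def zoned_storage_class(name, provisioner, zones,
                        topology_key="topology.kubernetes.io/zone"):
    """Storage class that restricts provisioning to the listed zones."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        # Delay provisioning until a pod is scheduled, so the volume is
        # created in a zone that pod can actually reach.
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowedTopologies": [
            {"matchLabelExpressions": [
                {"key": topology_key, "values": list(zones)}
            ]}
        ],
    }


def volume_snapshot(name, claim_name, snapshot_class,
                    api_version="snapshot.storage.k8s.io/v1"):
    """Snapshot request for the volume behind an existing PVC."""
    return {
        "apiVersion": api_version,  # assumption: varies by cluster version
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": claim_name},
        },
    }


if __name__ == "__main__":
    # All names and zones below are hypothetical.
    manifests = [
        block_claim("raw-data", "fast-array"),
        zoned_storage_class("zoned-array", "csi.example.com",
                            ["zone-a", "zone-b"]),
        volume_snapshot("raw-data-snap", "raw-data", "example-snapclass"),
    ]
    for manifest in manifests:
        print(json.dumps(manifest, indent=2))
```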

More information can be found in Kubernetes’ own overview of CSI.
