
Rook 101: Building software-defined containerised storage in Kubernetes

Rook – think castles, not birds – uses the principles of containerisation and the methods used in Kubernetes to build storage that’s abstracted from the hardware it lives on

Containerisation is sweeping IT, and it’s the abstraction that’s the attraction. In other words, use of containers decouples the application from the infrastructure needed to run it.

Of course, software needs hardware, but with containers – via a container orchestrator such as Kubernetes – you can build a dynamic environment in which applications can run on-premises or in the cloud. The containers and their orchestrator carry everything needed to create, manage, scale and take applications out of commission rapidly using automation.

We’ve looked elsewhere at the basics of storage in Kubernetes. Here, we’re going to look at another way of doing things, using Rook, which offers the ability to create pools of storage from within the Kubernetes cluster.

Rook – which is at version 1.2 at the time of writing – runs in the Kubernetes cluster and exposes and orchestrates persistent storage across a range of storage types.

Rook doesn’t aim to play with the mainstream storage hardware makers’ arrays or cloud providers’ storage. It could, in theory, use any array capacity as storage, but that’s not the real point of it. There would be nothing to gain by spending money on a storage array where you’re paying for the value added by the controller, when Rook provides that functionality.

Instead, Rook is essentially a software-defined containerised storage solution that aims at providing hyper-converged – all in the same node – or hyper-scale storage.

Rook was originally created as a way of containerising and managing Ceph, the open source block, file and object software-defined storage from Red Hat (acquired by IBM in 2019). But it can also be used to containerise storage types that include EdgeFS, which is a global file system based on object storage but with file, block and Amazon Simple Storage Service (S3) access methods.

Network file system (NFS) storage server access is also possible, as is use of the S3-compatible MinIO object storage server.

Rook also supports data services access, including the Apache Cassandra NoSQL database and the CockroachDB and YugabyteDB distributed cloud-native SQL databases.

Rook is now under the auspices of the Cloud Native Computing Foundation (CNCF) and is classified as an “incubating project”.

Rook puts the selected storage (grouped by storage class and referenced by persistent volume claims, or PVCs) into containers and provides cluster management for it, automating tasks such as scheduling, deployment, bootstrapping, configuration, scaling, load balancing and disaster recovery.
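To make the relationship between a storage class and a persistent volume claim (PVC) concrete, here is a minimal Python sketch that builds a PVC manifest as a plain dictionary and prints it as JSON. The claim name demo-data, the size and the storage class name rook-ceph-block are illustrative assumptions, not values taken from this article; the storage class must match whatever has actually been defined in your cluster.

```python
import json


def build_pvc(name, storage_class, size, access_mode="ReadWriteOnce"):
    """Return a Kubernetes PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # The storage class decides which provisioner serves the claim.
            "storageClassName": storage_class,
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # "demo-data" and "rook-ceph-block" are assumed names for illustration only.
    pvc = build_pvc("demo-data", "rook-ceph-block", "5Gi")
    print(json.dumps(pvc, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could in principle be piped to kubectl apply -f - in a test cluster where a matching storage class exists.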

Key to Rook operations is a Kubernetes operator that monitors resources to make sure storage is running to the requirements of the storage class, and acts to boot, heal, clone and maintain storage accordingly.
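The operator idea boils down to a loop that compares the desired state with what is actually running and then acts on the difference. The toy Python sketch below illustrates that pattern only; it is not Rook's code, and the ObservedState class, the reconcile function and the storage-N names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ObservedState:
    """What the (hypothetical) operator currently sees in the cluster."""
    running: Set[str] = field(default_factory=set)
    failed: Set[str] = field(default_factory=set)


def reconcile(desired_count: int, state: ObservedState) -> List[str]:
    """Run one reconcile pass and return the actions taken, in order."""
    actions = []

    # Heal: remove anything reported as failed.
    for name in sorted(state.failed):
        state.running.discard(name)
        actions.append(f"remove {name}")
    state.failed.clear()

    # Scale up: start instances until the desired count is reached.
    index = 0
    while len(state.running) < desired_count:
        name = f"storage-{index}"
        index += 1
        if name in state.running:
            continue
        state.running.add(name)
        actions.append(f"start {name}")

    # Scale down: stop any instances beyond the desired count.
    for name in sorted(state.running)[desired_count:]:
        state.running.discard(name)
        actions.append(f"stop {name}")

    return actions


if __name__ == "__main__":
    state = ObservedState(
        running={"storage-0", "storage-1", "storage-2"},
        failed={"storage-1"},
    )
    print(reconcile(3, state))  # first pass repairs the failed instance
    print(reconcile(3, state))  # second pass finds nothing to do
```

A real operator reacts to events from the Kubernetes API rather than to an in-memory object, but the principle is the same: once observed state matches desired state, a further pass should do nothing.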

Starting Rook in a cluster can begin with a few kubectl commands, depending on the storage provider.
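As a rough illustration of how short that sequence can be, the Python sketch below assembles a set of kubectl create commands and, by default, only prints them. The manifest file names are assumptions modelled on the example manifests shipped for the Ceph provider; they may differ between Rook releases and storage providers, so check the documentation for your version before running anything.

```python
import subprocess

# Assumed manifest names for the Ceph provider; verify against your Rook release.
CEPH_MANIFESTS = ["common.yaml", "operator.yaml", "cluster.yaml"]


def rook_install_commands(manifests=CEPH_MANIFESTS):
    """Build one 'kubectl create -f <manifest>' command per manifest file."""
    return [["kubectl", "create", "-f", manifest] for manifest in manifests]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run(rook_install_commands())  # dry run: prints the commands only
```

Keeping the dry run as the default means the sketch is safe to try on a machine with no cluster access.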

Rook’s potential benefits

Rook is essentially software-defined storage, which means developers can make storage resources programmable. In short, it allows the creation of pools of storage from a range of storage types and, in theory, allows the use of those storage resources to be portable across a number of on-site and cloud locations – although support for storage types and suppliers isn’t anywhere near universal yet.

Anyone who is seriously engaged with containerisation will see the benefits of containerising potentially disparate storage instances into software-defined pools using Rook.

Inherent in that is the ability to scale storage horizontally and vertically with faster and automatic provisioning of volumes for pods, automated healing from failed or corrupted disks, rapid automated deployment and better utilisation of resources.

Possible drawbacks and limitations

There are no limitations in a workload sense, in that wherever containers can be used, Rook is, in theory, a good fit. If you’re happy to containerise the processing of your data, then containerising the management of its storage should be fine for you too.

The biggest handicap currently for potential Rook users is that it’s still early days. Containerisation is not yet ubiquitous, and neither is the understanding of how and whether organisations should use it.

And Rook itself is in its early days, with some production users but an incomplete list of supported storage types.

Nevertheless, it’s an idea that makes perfect sense, so we will watch this space with interest.
