
Storage implications of a modern IT architecture

One of the challenges of migrating older applications to a cloud-native, modern IT architecture is how to provide persistent storage

IT leaders are deploying an increasing number of containers to modernise applications, run commercial off-the-shelf software, and handle artificial intelligence (AI) and advanced analytics workloads.

While containerisation is associated with a cloud-native microservices architecture, it is also used to host traditional enterprise applications, such as when organisations want to reduce their VMware estate and select Kubernetes to support both cloud-native and traditional applications. As a consequence, more organisations are having to work out how best to handle persistent storage with Kubernetes (see Acid versus Base box).

A Dimensional Research poll of 519 IT decision-makers at organisations with 500 or more employees for the Portworx Voice of Kubernetes report 2026 found that 71% of organisations plan to modernise and/or migrate their virtual machines to Kubernetes.

Pods and stateful applications

Containerisation started out as an approach to developing modern, cloud-native applications that were “stateless”. However, the Portworx report stated that 58% of IT decision-makers plan to run stateful applications such as databases on Kubernetes. 

While Linux containers – which Kubernetes groups into units called pods – can directly access storage, the disk files inside a container are short-lived and are deleted when the container goes down.
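In Kubernetes, the standard answer is to decouple data from the pod: the application requests storage through a PersistentVolumeClaim, and the cluster binds it to a PersistentVolume whose lifecycle is independent of any single container. A minimal sketch – the names, image and storage size here are illustrative:

```yaml
# A claim for 10Gi of storage that outlives any single container
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce        # mountable read-write by a single node
  resources:
    requests:
      storage: 10Gi
---
# A pod mounting the claim: files written under /var/lib/data
# survive the container being restarted or rescheduled
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /var/lib/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data
```

If the pod is deleted, the claim – and the data behind it – remains until the claim itself is removed.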

Looking at how Docker containers used to be deployed, analyst firm Gartner’s How to deploy and operate containers and Kubernetes report notes that these containers did not read information, apart from the parameters provided in their configuration files, and they did not store their state outside the operational scope or lifetime of a single container instance.

This has ramifications for IT teams tasked with reengineering enterprise applications to make them cloud native, where they are containerised and managed through Kubernetes.

Gartner is now seeing interest in using containers to deploy stateful applications, which means data can be accessed beyond the lifetime of a container. Stateful containers are also protected with backup and restore processes, and as Gartner notes, data may be shared by multiple containers.
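For stateful workloads such as databases, Kubernetes provides the StatefulSet controller, which gives each replica a stable identity and its own volume claim that survives rescheduling. A minimal sketch, with illustrative names, image and sizes:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: postgres:16          # illustrative database image
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  # Each replica gets its own claim (data-db-0, data-db-1, ...),
  # which persists even if the pod is rescheduled to another node
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi
```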

Containerisation of edge compute

One reason that more attention is being placed on persistent storage now is the data storage requirements of edge computing, which may need to cope with intermittent network connectivity or operate standalone.

In a blog published in 2022, Kevin Pitstick and Jacob Ratzlaff, software engineers at Carnegie Mellon University’s Software Engineering Institute, wrote about the benefits of using containerisation at the edge.

Pitstick and Ratzlaff wrote that containerisation enables recovery and continued operation under fault conditions. If a containerised edge application is engineered as a set of microservices, a further benefit is that when one container crashes, only a single capability goes down rather than the whole system.

They noted that Kubernetes can take advantage of virtual IP addresses to allow failover to a backup container if the primary container goes down, and orchestration can be used to redeploy containers automatically for long-term stability.
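The failover mechanism they describe is typically expressed as a Kubernetes Service: clients address a stable virtual IP, and the Service routes traffic to whichever healthy pods match its label selector, so a redeployed backup container picks up traffic automatically. An illustrative sketch, with example names and ports:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: edge-app
spec:
  selector:
    app: edge-app        # traffic flows to whichever healthy pods carry this label
  ports:
    - port: 80           # stable virtual IP and port seen by clients
      targetPort: 8080   # port the container actually listens on
```

When the primary pod fails and a replacement is scheduled, the Service’s endpoints update and clients continue using the same virtual IP.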

Acid versus Base

A typical enterprise database application relies on Acid (atomicity, consistency, isolation and durability) to ensure reads and writes to the data store are not compromised – a database transaction either completes or fails. This is a very different objective from that of a containerised application platform. Kubernetes is regarded as a Base system (basically available, soft state, eventual consistency), which is focused on being able to restart reliably in the event of a failure.

The approach taken by Kubernetes can lead to data corruption and data inconsistency unless steps are taken to prevent multiple pods from writing to the same storage volume at the same time. It is a problem well understood in the world of distributed computing, but the additional steps needed to ensure data is not corrupted add latency. This is why Kubernetes is offering tunable persistent storage, according to Michael Azoff, chief analyst at Omdia.

“You need that flexibility to support containerised microservices and enterprise transactional applications. In the old days, it was Acid consistency, and then they had a more flexible Base. But the modern storage systems have tunability between Acid and Base, which provides persistent storage flexibility,” he says.

Azoff says this is becoming more significant as more AI workloads are run on Kubernetes. He says some of the new developments supporting persistent storage in Kubernetes can be tuned up and down to provide maximum throughput and minimum latency when needed for AI workloads.
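At the level of an individual volume, the multiple-writer problem described above is managed through the access mode on the volume claim. A minimal sketch, with illustrative name and size – ReadWriteOnce restricts a volume to one node, while the stricter ReadWriteOncePod mode restricts it to a single pod:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data
spec:
  accessModes:
    - ReadWriteOncePod   # only one pod may mount this volume read-write
  resources:
    requests:
      storage: 5Gi
```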

According to Pitstick and Ratzlaff, containers can be easily spread across multiple edge systems to increase the chance that operations will continue if one of the systems gets disconnected or destroyed.

China’s first interconnected space computing constellation in orbit, led by Zhijiang Laboratory, uses CubeFS, a Cloud Native Computing Foundation (CNCF) project classified as “graduated”, as the storage base of its space-based distributed operating system to manage data distributed across multiple satellites. CubeFS was originally developed by Chinese e-commerce giant JD.com. Along with the resilience needed in software that operates in space, CubeFS has been optimised to handle large volumes of small files efficiently.

Such functionality is important for scaling systems, something that is gaining more attention as IT decision-makers consider the IT infrastructure requirements for AI.

Approaches to persistent storage

In a 2024 survey of 372 organisations, the CNCF found that organisations with only some experience of cloud-native computing and containerisation were more likely to find storage a challenge (15%) than those that consider themselves mature cloud-native organisations (13%). Interestingly, the organisations with the least concern about storage were those that had not yet started their cloud-native journey (11%).

While it is possible to support stateful applications using public cloud storage services such as Amazon Simple Storage Service (Amazon S3), object storage of this kind is accessed through an API rather than mounted as a volume, and such services are generally less flexible than open source Kubernetes persistent storage.

Prior to Portworx, we weren’t able to support our database workloads on Kubernetes. Now that we have Portworx at the core of our infrastructure, we provide our clients with the ultimate flexibility and speed for their most demanding persistent storage needs
Jeroen van Gemert, KPN

At the KubeCon conference in Amsterdam in March, SUSE discussed an update to its Longhorn persistent storage project to support the high-performance capabilities needed for AI. Roy Illsley, principal analyst at Omdia – who attended the conference and was also at the Nvidia GTC event in San Jose – says cloud-native persistent storage is particularly relevant for AI workloads.

“It’s probably becoming more relevant as people are realising that if we want to do AI and machine learning, and we want to use containers or cloud-native technology, then we’re going to need persistent storage,” says Illsley.

SUSE sees persistent storage as key to the successful modernisation of applications that can take advantage of containers and orchestration. It positions Longhorn as a product to help IT infrastructure teams address performance problems, such as databases slowing down and applications experiencing an increase in latency. According to SUSE, IT tends to address such performance bottlenecks by over-provisioning high-speed storage. Longhorn is classified as an incubating project by the CNCF, meaning it is already used successfully in production, though it has not yet reached the maturity of a graduated project.

SUSE says its Longhorn V2 Data Engine bypasses the Linux kernel storage stack and has been engineered to avoid the context-switching overhead inherent in traditional storage pipelines. SUSE says the new architecture allows storage operations to run on dedicated central processing unit (CPU) cores, unlocking higher throughput and lower latency.

It claims that early benchmarking has shown Longhorn V2 achieves a two- to four-times improvement in write performance and two to three times faster random read performance compared with the previous version. Along with other improvements, SUSE said the new version offers 10 times the performance of Longhorn V1 in certain scenarios.

Portworx, which was acquired by Pure Storage in 2020 and is now known as Portworx by Pure Storage, is a commercial alternative to SUSE Longhorn, built on open source technologies. At KubeCon, it unveiled new capabilities for Portworx Enterprise to support organisations running enterprise-level virtual machines (VMs) and containers in cloud, hybrid and on-premise environments. The company says these updates allow organisations to embrace modern virtualisation, accelerating provisioning, simplifying automation and saving time.

Dutch telco KPN has been using Portworx for a number of years to support software that requires persistent storage when it is containerised, as Jeroen van Gemert, DevOps engineer at KPN, explains: “Prior to Portworx, we weren’t able to support our database workloads on Kubernetes. Now that we have Portworx at the core of our infrastructure, we provide our clients with the ultimate flexibility and speed for their most demanding persistent storage needs, from databases like Redis and ElasticSearch to other high IO [input/output] workloads.” 

Persistent storage for digital sovereignty

Along with containerisation of edge computing applications, AI and the migration of workloads away from VMware, the rise in demand for data sovereignty means IT leaders are asking for stringent control over where their enterprise data is stored.

SAP, for instance, is building a digitally sovereign 120PB (petabytes) storage backbone for Europe, based on the open source Ceph software-defined distributed storage platform and Rook, a storage orchestrator for Kubernetes, which provides a management layer. Rook is another CNCF project. Its CNCF classification is graduated, making it suitable for production environments.

As the containerisation of stateful applications becomes more common, Gartner recommends IT decision-makers consider a variety of approaches for providing data persistence to containerised applications. While providing persistent data storage for applications that run on a single host is straightforward, the analyst firm notes that when applications are clustered across multiple hosts for high availability or scalability, orchestrating the state of this data can be more challenging.

Read more about persistent storage

  • Cloud storage security best practices: Protecting stored cloud data requires meticulous planning and a well-crafted strategy. Here are the steps to take to keep your organisation’s data safe and secure in the cloud.
  • 12 ways to manage your data storage strategy: Data storage systems were never easy to manage, and with spiralling capacities, it’s become even harder. Try these 12 technologies and practices to help ease the storage management burden.
