
How open source is shaping data storage management

Open source data storage offers a great deal of flexibility, but unlocking its benefits will require strong technical resources to meet requirements such as stability, high availability and security

This article can also be found in the Premium Editorial Download: CW Asia-Pacific: CW APAC: Trend Watch: Storage management

In 2016, a group of companies banded together to address a key storage management challenge that they and their customers were facing: managing a heterogeneous storage footprint that was hampering the deployment of storage and data services.

The group, comprising Dell EMC, Fujitsu, Hitachi, Huawei, Intel and Vodafone, formed the OpenSDS (open software-defined storage) project, an open source project incubated under the Linux Foundation, to build a community that would address those issues in a generic and standardised way.

“No customer uses a single vendor for storage – they wanted something like a platform or framework to connect to different kinds of storage and do monitoring and deployment,” says Steven Tan, chairman of the Soda Foundation, which was formed in June 2020 to expand the scope of the OpenSDS project.

While OpenSDS has paved the way for virtualised storage that pools multiple storage systems together, the Soda Foundation takes things further by fostering an ecosystem of open source data management tools and capabilities, from the edge to the cloud.

Soda, a recursive acronym that stands for Soda Open Data Autonomy, is made up of seven core projects focused on delivering capabilities such as infrastructure management, multicloud data management and application programming interfaces (APIs), among others.

“I think the greatest part about OpenSDS and the Soda Foundation is to bring everybody together to come up with a solution,” Tan told Computer Weekly, noting that Soda is a loose framework that is flexible enough for any user or vendor to adapt or extend to meet their needs.

Indeed, end-user organisations and major open source software and storage vendors are jumping on the bandwagon – whether they see their work as being part of the Soda framework or not.

Longhorn project

Rancher Labs, for example, originally developed the open source Longhorn project to provide a cloud-native distributed storage platform for Kubernetes.

“It works with any Kubernetes distribution and makes the deployment of highly available persistent block storage in your Kubernetes environment easy, fast and reliable across x86 and ARM64 architectures in the datacentre, in the public cloud and at the edge,” says Vishal Ghariwala, SUSE’s chief technology officer (CTO) for Asia-Pacific and Japan and Greater China.

SUSE, which acquired Rancher Labs in 2020, contributes actively to the Longhorn storage project, which is being advanced as a sandbox project under the Cloud Native Computing Foundation (CNCF). SUSE also contributes to the Ceph project, which is designed to provide scalable object-, block- and file-based storage under a unified system.

Open source has also attracted the attention of data storage suppliers such as NetApp. The company already contributes to the Soda Foundation’s open data framework, which includes KubeEdge integration and file support for NetApp’s Ontap data management software.

At the edge, enterprise IT giant Dell is a contributor to StarlingX, a cloud infrastructure software stack for the edge used by demanding applications in industrial internet of things, telecoms, video delivery, and other ultra-low-latency use cases.

In 2019, Dell contributed prototype code to the Linux Foundation to seed Project Alvarium, which delivers data from devices to applications with measurable confidence and trust. It has also engineered Project Nautilus, a real-time analytics and streaming storage solution built from the ground up to provide the foundation for reliable streaming applications.

Benefits of open source storage

Matthew Hurford, vice-president for solutions engineering and field CTO of NetApp in Asia-Pacific, notes that a key benefit of open source storage is access to innovation.

“The open source software community draws a huge pool of technologically diverse talents globally,” he says. “Motivated to solve challenges, these talents can contribute to existing open source codes. This collaborative nature of open source communities results in a virtuous cycle whereby the resultant software created improves over time through the collective revisions of various contributors.”

For example, Apache Spark has more than 2,000 developers and over 3,000 commits annually. It would have taken almost 270 years to develop Spark outside an open source framework. Linux had more than 23,000 developers and 75,000 commits in the last 12 months alone.

“At NetApp, we will continue to contribute to open source communities and projects, such as CNCF (Kubernetes, Helm and Istio), the Linux kernel, and many others,” says Hurford.


Another benefit is shorter time to market. Business processes can be more agile with effective automation and management of data. Modifications can be made to products promptly and improve deployment speed.

“Continuous integration and continuous deployment are examples of this,” says Hurford. “In addition, open source can be customised to organisational needs. Due to its modular nature, vendors can adjust the code easily. This enables open source to function as comprehensive as proprietary software at any layer of the enterprise stack.”

Open source also provides a great amount of agility and flexibility, says Ghariwala. For example, when defining a storage and data management software architecture, organisations have the flexibility to pick open source technologies from multiple suppliers, instead of being locked into a single vendor.

“You also have the agility to change to another open source vendor providing similar capabilities, which could be due to cost factors, technology factors or even business factors, such as an open source vendor being acquired by another larger vendor, which may impact how a customer does business with the original vendor,” he says.

“With proprietary solutions, this level of agility and flexibility is generally available only when you are using solutions from that vendor or a few exclusive partners of theirs.”

Adoption considerations

While open source data storage software is cost-effective, there is a big difference between downloading a project for free to try out on a developer machine and using it to power mission-critical applications that have stringent requirements, such as stability, high availability and security.

Ghariwala says enterprises will need strong technical resources to architect a solution that supports their mission-critical application requirements, as well as dedicated resources to triage production issues. This can be very complex for most organisations.

The second challenge that enterprises may face relates to flexibility, which is not guaranteed when using open source technologies. Ghariwala says the problem generally arises when vendors support only their own technologies with their commercial open source solutions, creating lock-in and limiting an organisation’s ability to choose the right system for its needs.

Danny Elmarji, vice-president for presales at Dell Technologies in Asia-Pacific and Japan, notes that some Dell customers are starting to define and use their own software storage that runs on Dell’s hardware and compute, leveraging open source contributions.

Although these organisations have the talent and capabilities to support, design, build and sustain the lifecycle of their customised software-defined stack, they need to ensure that their software-defined storage layers are designed to integrate with the hardware platform, as well as manage their lifecycle and support experience.

Cyber resilience

Considering the sharp increase in cyber threats since 2020, Elmarji also urges organisations to consider the cyber resilience aspects as they adopt open source projects.

“The fundamentals of truly durable storage and data management products have not changed, and efficiency, performance, throughput and resilience are still considered top criteria when making procurement decisions,” he says.

To take advantage of open source storage and data management software, Hurford says organisations will need to consider whether the software can support a hybrid and multicloud environment, integrate seamlessly with Kubernetes, and enable scalability and agility with data storage and management.

On top of that, it will be crucial to ensure the data is protected and archived in a space-efficient manner.

“A comprehensive solution for cloud storage management gives users out-of-the-box usability and integration with cloud and on-premise resources,” says Hurford. “This will greatly reduce the length of time it takes to put a file service into use, and limit the overhead involved in maintaining the solution.”
