Open source software maker Red Hat has announced enhancements to its Gluster file system, including improved metadata caching to speed searches and deeper integration with its OpenShift container platform.
The upgrade to Gluster 3.2 also adds so-called Arbiter Volumes, in which inconsistencies within file system clusters can be resolved with two full copies and one copy of metadata rather than three full copies, as well as faster self-healing of erasure-coded volumes.
Gluster is a file-access (NAS) parallel file system that can scale out across multiple nodes under a single namespace that runs to millions of files. Data protection is provided by 3x replication or erasure coding.
Parallel file systems like Gluster are best suited to unstructured file data held in large quantities. They face increasing competition from object storage as a method of data retention for such workloads.
Improved metadata operations in Gluster 3.2 are the result of client-side caching, which means directory operations, finds and so on do not have to go to the server; Red Hat claims they are up to 8x faster than in previous versions.
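On the command line, this caching is exposed as a volume option group. A sketch of enabling it, assuming the `metadata-cache` group and `network.inode-lru-limit` option as documented for this release; the volume name `myvol` is a placeholder:

```shell
# Enable the metadata-cache option group on an existing volume
# (caches stat and xattr data on the client so lookups avoid a server round trip)
gluster volume set myvol group metadata-cache

# Optionally raise the number of files whose metadata is cached per client
gluster volume set myvol network.inode-lru-limit 50000
```

These commands require a running Gluster cluster, so they are shown here purely as a configuration sketch.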
The improved Gluster integration with OpenShift allows customers to deploy Gluster instances from within containers, connect them as a cluster with geo-replication, and support applications on the hosts that OpenShift runs on.
This adds containers to the existing deployment options for Gluster file system nodes, alongside physical servers, virtual machines and the cloud.
The introduction of Arbiter Volumes means that when using three-way replication, instead of needing three full copies to resolve inconsistencies between instances, such as split-brain situations, Gluster can reconcile clusters with two full copies and a copy of the metadata. The key benefit is reduced storage capacity and hardware requirements.
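In Gluster's CLI, such a volume is created as a three-way replica in which the third brick is designated the arbiter. A sketch, with server names and brick paths as placeholders:

```shell
# Two full data bricks plus one metadata-only arbiter brick
gluster volume create myvol replica 3 arbiter 1 \
  server1:/bricks/brick1 \
  server2:/bricks/brick2 \
  server3:/bricks/arbiter   # holds file names and metadata only, no file data
gluster volume start myvol
```

The arbiter brick consumes only a fraction of the capacity of a full replica, which is where the hardware saving comes from.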
Red Hat storage product head Sayan Saha said: “Usually, with three-way replication, you can expect a 3x overhead. But with Arbiter Volumes, you can operate with a copy of the metadata, but not the data. So, in cases where there are nine nodes, you can solve split-brain problems with six or seven nodes, with the Arbiter Volume residing on a separate node or in one of the others.”
Faster self-healing of erasure-coded volumes has been built into Gluster to allow quicker recovery from data loss or corruption. Erasure coding is a form of data protection in which additional data is created that can help rebuild a volume in the event of data loss. Gluster 3.2 has parallelised its self-healing operations, allowing multiple threads where there was previously only one.
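The degree of self-heal parallelism is tunable per volume. A hedged sketch, assuming the `disperse.shd-max-threads` option as named in upstream GlusterFS of this vintage, with `myvol` and the brick paths as placeholders:

```shell
# Create a dispersed (erasure-coded) volume: 4 data bricks plus 2 redundancy bricks
gluster volume create myvol disperse 6 redundancy 2 \
  server{1..6}:/bricks/brick1

# Allow up to 4 parallel self-heal threads per brick instead of the single-threaded default
gluster volume set myvol disperse.shd-max-threads 4
```

As above, these commands only make sense against a live cluster and are shown as a configuration sketch.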