I spoke with French startup Rozo this week.
Its USP is that it marries the scalability of scale-out NAS (virtually unlimited capacity, with potentially billions of file names) with erasure coding. It targets high performance computing (HPC) and analytics use cases, especially those that involve large volumes of large unstructured files.
There seems to be a rumbling battle going on between scale-out NAS and object storage for use cases that involve lots of unstructured data. Both approaches can handle very large numbers of files and can usually scale performance and capacity independently.
But usually erasure coding as a data protection method is the preserve of object storage. In this case Rozo applies it to scale-out NAS.
According to Philippe Nicolas, systems advisor with Rozo, erasure coding allows Rozo to sustain high levels of I/O for all data even during disk failures. At the same time it allows clusters to be built more cheaply than most scale-out NAS, which uses replication as its key data protection method.
He said: “Replication is very good for small files. It’s fast and there’s no coding, just a copy. But with large files replication takes time and there’s a CPU impact.”
The cost advantage arises because replication requires the customer to provide at least double the storage capacity to hold the replicated copies. Erasure coding, at least with the Mojette Transform algorithm (the Mojette is a famous bean in the Nantes area), requires only a 1.5x capacity overhead.
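To make the capacity comparison concrete, here is a minimal sketch of the arithmetic. It assumes a generic erasure coding layout of k data fragments plus m redundancy fragments; the 4+2 layout and the 100TB figure are hypothetical values chosen only because 4+2 yields the 1.5x overhead quoted above, and they are not taken from Rozo's documentation.

```python
def erasure_overhead(data_fragments: int, redundancy_fragments: int) -> float:
    """Raw-to-usable capacity ratio for a generic k+m erasure coding layout."""
    return (data_fragments + redundancy_fragments) / data_fragments


def raw_capacity_needed(usable_tb: float, overhead: float) -> float:
    """Raw capacity (TB) required to store usable_tb of data at a given overhead."""
    return usable_tb * overhead


# Hypothetical example: 100TB of usable data.
usable_tb = 100.0

# Two-copy replication stores every byte twice, so the overhead is 2.0x.
replication_raw = raw_capacity_needed(usable_tb, 2.0)

# Assumed 4+2 layout, picked to match the 1.5x overhead cited in the article.
erasure_raw = raw_capacity_needed(usable_tb, erasure_overhead(4, 2))

print(f"Replication (2 copies): {replication_raw:.0f}TB raw")
print(f"Erasure coding (4+2):   {erasure_raw:.0f}TB raw")
```

Under these assumptions, the same 100TB of data needs 200TB of raw disk with two-copy replication but 150TB with erasure coding, which is where the hardware saving comes from.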
According to Nicolas, Rozo is the only scale-out NAS supplier to offer erasure coding for all data. EMC’s Isilon uses it only for data above 128KB.
Customers can build Rozo clusters by installing the software on commodity servers. Storage comes from internal server storage or can be on a JBOD or even the user’s existing shared storage array.
The Mojette Transform algorithm encodes all data upon ingest but also maintains metadata for rapid index-related functionality.
Rozo can be downloaded and tried for free in its community edition, which does not include optimised erasure coding. The commercial Advanced version does.
Rozo’s target markets are high performance computing (HPC), media and entertainment, and genomics. Specifically, said Nicolas, it is aimed at use cases where you find the Lustre parallel file system, Qumulo scale-out NAS or EMC’s Isilon.