Object storage technology was developed to resolve problems created by traditional hierarchical file systems when managing web-scale storage. In particular, very large file systems with deep hierarchies become unwieldy when accessing one file from a set of billions.
Object storage implementations vary and there is no single recognised definition, but they do share several broad attributes.
Instead of storing data as files, object storage represents data, comprising payload and contextual metadata, as objects, each addressed by a unique identifier and stored in a flat address space with no sub-directories. Objects are retrieved using identifiers held in an indexed database and are assembled into files at a higher level.
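To make the flat address space concrete, here is a minimal in-memory sketch in Python. The class name `FlatObjectStore` and its `put`, `get` and `head` methods are hypothetical, chosen for illustration only; they do not represent any vendor's API, and a real system would spread the index and payloads across many nodes.

```python
import hashlib
import uuid


class FlatObjectStore:
    """Toy object store: one flat namespace keyed by object ID (illustrative only)."""

    def __init__(self):
        # The "index": object ID -> (payload bytes, metadata dict).
        # There are no directories, only identifiers.
        self._objects = {}

    def put(self, payload: bytes, **metadata) -> str:
        """Store a payload with its metadata and return a new unique identifier."""
        object_id = uuid.uuid4().hex
        metadata["size"] = len(payload)
        metadata["sha256"] = hashlib.sha256(payload).hexdigest()
        self._objects[object_id] = (payload, metadata)
        return object_id

    def get(self, object_id: str) -> bytes:
        """Fetch the payload by identifier; no path traversal is involved."""
        return self._objects[object_id][0]

    def head(self, object_id: str) -> dict:
        """Return a copy of the metadata stored alongside the payload."""
        return dict(self._objects[object_id][1])


store = FlatObjectStore()
oid = store.put(b"quarterly report", content_type="text/plain", owner="finance")
print(oid, store.head(oid))
```

The point of the sketch is that lookup cost depends on the index, not on the depth of a directory tree, which is why the approach copes with billions of objects.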
To assess what is available on the market today, let's look at object storage technology produced by some of the smaller storage vendors - those from outside the big six storage suppliers.
The product sets share common attributes and target markets, namely customers that need to address very large volumes of data, in use cases such as active archiving, public cloud, high-performance computing (HPC) and big data.
All the object storage systems assessed below are distributed and use erasure coding rather than Raid for resilience, while most offer access via traditional file protocols that are translated into object calls at a lower layer.
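As a rough illustration of the idea behind erasure coding, the Python sketch below splits data into fragments and adds a single XOR parity fragment, so that any one lost fragment can be rebuilt from the rest. This is the simplest possible code and an assumption made for illustration: the products covered here use stronger schemes that survive several simultaneous failures, and the article does not specify which. The function names `encode` and `recover` are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list:
    """Split data into k equal fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")  # zero-pad to a multiple of k
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def recover(fragments: list, missing: int) -> bytes:
    """Rebuild the fragment at index `missing` by XOR-ing all the survivors."""
    survivors = [f for i, f in enumerate(fragments) if i != missing]
    return reduce(xor_bytes, survivors)


original = b"object storage payload"
stored = encode(original, k=4)  # 4 data fragments + 1 parity fragment

lost = 2  # pretend the node holding fragment 2 has failed
damaged = [f if i != lost else None for i, f in enumerate(stored)]
rebuilt = recover(damaged, lost)
assert rebuilt == stored[lost]
```

In a distributed system each fragment would sit on a different node or site, so the loss of a drive or a whole node costs only one fragment.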
One company, Exablox, aims at the mid-market and offers higher-level features such as data deduplication and encryption, and simpler configuration in a bid to reduce administrative complexity.
NEC also offers encryption. Most offer systems as a hardware appliance, two produce software only, and one, DDN, offers software integrated with hardware from specific third parties.
Amplidata’s Amplistor is designed for customers that need to access billions of objects with storage capacities of exabytes and beyond in enterprise systems and private and public clouds.
Amplistor uses erasure coding for resilience, and Amplidata claims "15-nines" durability. The system consists of a 44U rack with two 40-port Ethernet switches and up to 468 drives in 39 storage nodes.
The nodes use SATA drives and consist of the AS36 with a capacity of 36TB or the AS48 with 48TB, connected via a pair of 1GbE ports. The whole is managed by three controller nodes, each with a pair of 1GbE ports.
Capacity is scaled by adding more storage nodes, while throughput is scaled by adding controller nodes. Throughput is claimed to be 1Gbps per controller. Amplidata says its software-defined storage technology means performance scales linearly and is effectively infinite.
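The rack figures quoted for Amplistor can be cross-checked with a little arithmetic. The Python sketch below uses only the numbers given above; the derived per-drive and per-rack values are raw capacities worked out here, not vendor-published figures, and they ignore the overhead of erasure coding.

```python
# Figures quoted in the article for a full Amplistor rack.
DRIVES_PER_RACK = 468
NODES_PER_RACK = 39
NODE_CAPACITY_TB = {"AS36": 36, "AS48": 48}

# Drives per storage node, assuming drives are spread evenly across nodes.
drives_per_node = DRIVES_PER_RACK // NODES_PER_RACK

for model, node_tb in NODE_CAPACITY_TB.items():
    drive_tb = node_tb / drives_per_node      # implied size of each SATA drive
    rack_raw_tb = node_tb * NODES_PER_RACK    # raw capacity of a rack of this model
    print(f"{model}: {drive_tb:g}TB per drive, {rack_raw_tb}TB raw per full rack")
```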
Caringo offers its Swarm object storage software stack – previously named CAStor – that boots from bare metal and runs on user-provided commodity hardware from multiple vendors with a variety of specifications. Archiving, big data, compliance and cloud storage are Caringo's target markets – those that need to store very large volumes of unstructured data.
Swarm consists of a massively parallel architecture, using a redundant array of independent controller nodes that run entirely in RAM, with each node able to perform every process. Data resilience is provided by erasure coding.
Features include optional Worm capability, the ability to scale dynamically by adding a new node to the cluster, and an energy-saving, adaptive power conservation technology that spins disks down and reduces CPU utilisation.
CleverSafe's object storage product portfolio is aimed at big data, archiving, cloud and multi-tenant environments.
The product set consists of three types of appliance: the dsNet Manager, which performs management operations and the largest of which can handle up to 100PB; Accesser appliances, which front-end requests from applications at throughput of up to 750Gbps; and Slicestor storage nodes.
Slicestors come in three sizes: the 1U 4100 with 3.8TB, the 2U 2212 that provides 48TB, and the 1440 that fields 192TB from a 4U box. All house SATA disks and include a pair of 1GbE ports as standard, with options to add two 10GbE ports.
Access is via a REST API and Amazon S3, but for traditional file protocols such as NFS and CIFS you need a third-party gateway. The system is inherently multi-site, with data slices stored in multiple locations to aid resilience.
CleverSafe's design scales by adding more storage nodes and, the company claims, does so without limit. It also claims its systems are 100 million times more reliable than Raid, and that they cost up to 90% less than traditional storage.
DDN's Web Object Scaler (WOS) is software designed to manage a range of types of unstructured distributed data at web-scale volumes.
WOS runs on DDN's WOS High Performance or Archive appliances, and is claimed to be ready to run on Hyve Solutions' OCP-ready Ambient Servers and CTERA's Cloud Storage Platform.
It can also run on customer-procured hardware. DDN has collaborated with Tiger Technology to create automated data lifecycle management and collaboration, and with ASG Software Solutions to create a high-volume archiving and data movement system.
The High Performance node comes loaded with WOS software and consists of a 4U box with a pair of redundant controllers that provide system management and host connectivity. There are slots for 60 disks that can be flash, SAS or SATA intermixed. Access is via four 10GbE ports or a 40GbE port, with up to 2.4GBps throughput. The 4U Archive node also provides 60 slots for SAS disks only, with access via one 10GbE or one 20GbE port.
WOS scales out in clusters of up to 256 nodes, and clusters can be combined to create a distributed namespace up to one exabyte in size containing 32 trillion objects, accessed using RESTful APIs and via file system gateways using NFS, CIFS and IBM's GPFS.
Exablox offers mid-market customers a scale-out, object-based storage appliance, the OneBlox, managed by OneSystem, the company's multi-tenant, cloud-based management software.
Each 2U appliance provides eight drive slots, and will accept any drive type, of any size, for a maximum of 32TB. Customers provide their own drives, says Exablox. The boxes are deployable in a ring architecture to provide a single file system. A ring can consist of one or more boxes scaling to 288TB. Multiple rings can be connected to allow mirroring or replication.
Exablox says the system is easy to set up and to scale by adding more disks to a box or adding another box to the ring using a Bluetooth-like pairing process. Object storage is managed by Exablox's custom-designed file system.
Unusually, it also offers higher-level features such as inline data deduplication, replication and snapshots, and AES256 encryption. Management is web-based and the file system is accessible using CIFS, with NFS on the roadmap, using the four 1GbE ports.
NEC’s Hydrastor is a cluster of purpose-built storage and accelerator nodes. The former are used to expand capacity, the latter to expand capacity and throughput. It uses erasure coding and provides up to 12TB of capacity per node and can scale to 165 nodes.
It is a scale-out grid storage platform with in-line global data deduplication supporting capacities of more than 100PB and claimed throughput of more than 4PB/hr (1TBps). The system supports a maximum capacity of 256PB per file system, and access is over CIFS, NFS and Symantec's OST.
Resilience is provided by erasure coding, and the system can tolerate up to six drive or node failures without reduction in I/O. AES128 or AES256 encryption for data at rest is standard, while WAN-optimised replication and in-flight encryption are available as options, along with Worm functionality.
NEC's HS8 series is aimed at large enterprises that need scalable backup for long-term data. Maximum raw capacity is 7.9PB, rising to an effective capacity of 103PB after data deduplication and backup. It offers six 1GbE ports, or up to four 10GbE plus two 1GbE ports.
The HS3 series is designed for SME and remote office deployments. It offers similar features to the HS8 series but with throughput of up to 19.8TB/hr (or 5.5GBps) and a maximum effective capacity of 312TB from 24TB raw. Access is via six 1GbE, or two 10GbE plus four 1GbE, or four 10GbE plus two 1GbE ports.
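The per-hour and per-second throughput figures quoted for Hydrastor, and the gap between raw and effective capacity, can be sanity-checked with simple arithmetic. The Python sketch below assumes decimal units (1PB = 1,000TB, 1TB = 1,000GB); the helper name `per_hour_to_per_second` is hypothetical, and the ratios are derived here from the quoted numbers rather than taken from NEC.

```python
SECONDS_PER_HOUR = 3600


def per_hour_to_per_second(value_per_hour: float) -> float:
    """Convert a per-hour rate to a per-second rate in the same unit."""
    return value_per_hour / SECONDS_PER_HOUR


# HS3: 19.8TB/hr expressed in GB/s (multiply by 1,000 to go from TB to GB).
hs3_gb_per_s = per_hour_to_per_second(19.8 * 1000)

# Grid maximum: 4PB/hr expressed in TB/s (multiply by 1,000 to go from PB to TB).
grid_tb_per_s = per_hour_to_per_second(4 * 1000)

# Effective-to-raw capacity ratios implied by the quoted figures.
hs8_ratio = 103 / 7.9   # PB effective / PB raw
hs3_ratio = 312 / 24    # TB effective / TB raw

print(f"HS3 throughput: {hs3_gb_per_s:.1f}GBps")
print(f"Grid throughput: {grid_tb_per_s:.2f}TBps")
print(f"Effective-to-raw ratio: HS8 {hs8_ratio:.1f}:1, HS3 {hs3_ratio:.1f}:1")
```

On these numbers, both series imply an effective-to-raw ratio of roughly 13:1. How much of that any customer sees depends on how well their data deduplicates.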
Scality Ring is software that runs on any commodity hardware and manages large volumes of unstructured data such as emails and media files. It is aimed at service providers, video broadcasters, Web 2.0 portals, HPC applications, and those managing large-scale active archives.
The storage media can consist of anything from flash to SATA drives, made resilient by erasure coding technology. System design is based on distributed nodes, which can be added and removed without data loss, allowing scalability, with data balanced across nodes for further resilience and performance enhancement. The system scales to exabytes, according to Scality, and is claimed to offer higher performance than other object stores.
This was first published in July 2014