For IT professionals used to working with traditional file systems for their entire careers, object-based storage is like a whole new dimension they never knew existed. The concept of object storage can require a rewiring of the thought process concerning how data can be indexed and accessed in storage environments.
That’s because object-based storage systems throw out the existing file system approach. Forget hierarchies, which become ever more complex and slower to traverse as they grow. The object approach dispenses with these and instead uses a structurally flat data environment.
Object storage devices can be aggregated into grid storage structures built from disparate nodes. These grids take on work traditionally performed by single subsystems, while providing load distribution and resilience far in excess of what is available in a traditional SAN environment.
In other words, object storage devices operate as modular units that can become components of a larger storage pool and can be aggregated across locations. The advantages of this are considerable as distributed storage nodes can provide options to increase data resilience and enable disaster recovery strategies.
To better understand the object structure, think of the objects as the cells in a beehive. Each cell is a self-contained repository for an object ID number, metadata, data attributes and the stored data itself. Each cell is also a separate object within the usable disk space pool.
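The beehive picture can be sketched in a few lines of Python. This is only an illustration, not any vendor's implementation: the class names, the field names and the use of random hexadecimal IDs are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One 'cell': an ID, descriptive metadata, data attributes and the data."""
    object_id: str
    metadata: dict
    attributes: dict
    data: bytes


@dataclass
class FlatObjectStore:
    """A flat pool: every object sits at the same level, keyed only by its ID."""
    objects: dict = field(default_factory=dict)

    def put(self, data: bytes, metadata: dict) -> str:
        object_id = uuid.uuid4().hex  # hypothetical ID scheme
        attributes = {"size_bytes": len(data)}  # attributes derived from the data
        self.objects[object_id] = StoredObject(
            object_id, dict(metadata), attributes, data
        )
        return object_id

    def get(self, object_id: str) -> StoredObject:
        # A single lookup by ID; there is no directory tree to walk.
        return self.objects[object_id]


store = FlatObjectStore()
oid = store.put(
    b"scan of a museum catalogue page",
    {"collection": "archive", "year": 1890},
)
obj = store.get(oid)
```

Note that there is no path anywhere in the sketch: the ID alone locates the object, and everything needed to describe it travels with it.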
Physically, object-based storage arrays are composed of stackable, self-contained disk units, like any other SAN array. Unlike traditional storage arrays, however, object-based systems are accessed via HTTP. That access method, together with their ability to scale to petabytes of data, makes object-based systems a good choice for public cloud storage. An entire object storage cluster of disparate nodes can easily be combined into an online, scalable file repository.
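To give a feel for HTTP access, the sketch below builds (but does not send) a PUT and a GET request using Python's standard library. The endpoint, the `/objects/` URL layout and the `X-Meta-` header prefix are hypothetical; real products each define their own conventions.

```python
import urllib.request

BASE_URL = "http://objectstore.example.com"  # hypothetical endpoint


def build_put(object_id: str, data: bytes, metadata: dict) -> urllib.request.Request:
    """Build a request that would store an object, with metadata as headers."""
    headers = {f"X-Meta-{key}": str(value) for key, value in metadata.items()}
    headers["Content-Type"] = "application/octet-stream"
    return urllib.request.Request(
        f"{BASE_URL}/objects/{object_id}",
        data=data,
        headers=headers,
        method="PUT",
    )


def build_get(object_id: str) -> urllib.request.Request:
    """Build a request that would fetch an object by its ID."""
    return urllib.request.Request(f"{BASE_URL}/objects/{object_id}", method="GET")


put_req = build_put("a1f3", b"archived image bytes", {"collection": "archive"})
get_req = build_get("a1f3")

# Nothing is sent here; urllib.request.urlopen(put_req) would perform the call.
print(put_req.get_method(), put_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

Because the interface is plain HTTP, any client that can issue web requests can read or write objects, which is part of what makes the model attractive for cloud services.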
Data retrieval in an object storage environment
So, how does data retrieval take place in the object storage environment? To retrieve data, the storage OS reads the object ID number and metadata associated with that data and then makes the data available.
This eliminates the need to delve into deep file structures, and intelligent caching can speed the process. The metadata also enables storage administrators to apply preservation, retention and deletion policies to their data.
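A rough sketch of these ideas follows: retrieval by ID, a metadata search and a metadata-driven retention check. The object IDs, the metadata fields (`type`, `stored`, `retain_years`) and the policy rule are invented for illustration and do not reflect any particular storage OS.

```python
from datetime import date

# Flat index: object ID -> (metadata, data). No directories to walk.
objects = {
    "a1f3": ({"type": "image", "stored": date(2005, 3, 1), "retain_years": 10}, b"img"),
    "b7c9": ({"type": "report", "stored": date(2020, 6, 15), "retain_years": 7}, b"doc"),
}


def retrieve(object_id: str) -> bytes:
    """Fetch the data directly by object ID."""
    return objects[object_id][1]


def find_by_metadata(**criteria) -> list:
    """Return the IDs of objects whose metadata matches every criterion."""
    return sorted(
        oid
        for oid, (meta, _) in objects.items()
        if all(meta.get(key) == value for key, value in criteria.items())
    )


def past_retention(today: date) -> list:
    """Return the IDs of objects whose retention period has lapsed."""
    lapsed = []
    for oid, (meta, _) in objects.items():
        # Simplification: assumes no object was stored on 29 February.
        keep_until = meta["stored"].replace(
            year=meta["stored"].year + meta["retain_years"]
        )
        if keep_until <= today:
            lapsed.append(oid)
    return sorted(lapsed)


print(retrieve("a1f3"))
print(find_by_metadata(type="report"))
print(past_retention(date(2024, 1, 1)))
```

The deletion policy never inspects the data itself; the metadata stored alongside each object is enough to decide what can be removed.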
Well-suited to static data
While object-based storage has advantages in scalability and rapid data retrieval over standard storage, the object approach lacks maturity when it comes to rapid I/O application serving. The object approach cannot rival block-based systems for the dynamic read/write speeds required by disk resource-intensive applications such as CRM databases. Object storage systems are optimised to serve static data -- for example, the digital archive of a museum.
This means object-based storage has a niche in the current storage market. It’s a good fit for use cases where static, hugely scalable archival storage is required and the use of SAN arrays designed to host virtual servers or highly dynamic application systems would be cost-prohibitive. It is also a viable alternative to archiving data on tape or optical disk; those media are slow, do not scale and make data difficult to retrieve.
So far the penetration of object-based storage systems into the small and medium-sized enterprise (SME) market is limited. You’re no doubt aware of public cloud storage as an option for your organisation’s data needs, but you might not know that many of these services are now hosted in object storage environments.
Object-based technology is currently used mostly in large storage environments where data is primarily static and rapid application read/write is not at the top of the requirements list. As a result, it has yet to become an immediate concern for storage administrators managing in-house arrays, who are primarily concerned with application performance.
Object-based storage products
Moves into object storage by traditional storage technology providers such as Dell and NetApp over the past year will certainly raise the profile of object storage.
Dell’s DX Object Storage Platform starts at 6 TB of raw space on a base-level system, so it could appeal to larger small and medium-sized businesses (SMBs) that need to retain static data on near-line storage.
NetApp entered the object storage market with the acquisition of Bycast and its StorageGrid technology. StorageGrid is an enterprise-level cloud storage product that can be used with NetApp’s FAS and V-Series controllers in addition to providing native HTTP access to data. That combination means NetApp can deliver an object storage system suitable for all sizes of organisation, even those lacking enterprise-level budgets.
For larger implementations, the EMC Centera scalable object storage platform starts at 16 TB of raw capacity. Meanwhile, EMC’s Atmos is its flagship object storage platform aimed at cloud storage providers. It starts at 120 TB and scales to petabytes and billions of files across geographically dispersed nodes with automated, policy-based management and replication ensuring optimum access to content.
DataDirect Networks’ Web Object Scaler is an almost direct competitor to EMC Atmos; it can scale from two 7.2 TB nodes up to 6 PB and 200 billion files.
European competition comes from French vendor Scality with its Ring architecture, which can link together x86 servers as nodes in an object-based cloud, an approach that could be of interest to smaller companies. It also scales to petabytes and uses a distributed model for its object location tables (as does Atmos, but not DataDirect Networks).
For IT professionals working in environments with large archival and static data retrieval requirements, it has been a struggle to find cost-effective storage for this type of data. With object-based storage, there’s now a new range of options to seriously consider.