Storage 101: Object storage vs block vs file

We recap the key attributes of file and block storage access and the pros and cons of object storage, a method that offers key benefits but also drawbacks compared with SAN and NAS

Antony Adshead, Computer Weekly

Published: 01 Mar 2017

The emergence of object storage as a viable means of data retention challenges the two established, closely related methods of file and block storage, usually delivered as NAS and SAN respectively.

This article recaps the fundamentals of file and block storage, with the aim of highlighting the quite different characteristics of object storage. All three are forms of shared storage. In the final analysis, we will suggest the use cases best suited to object storage, as well as to file and block.

The trigger for this recap is the rise of object storage, which has become prominent in the form of array-type products and as the basis for cloud-based protocols such as Amazon’s S3.

To see how object storage differs significantly from SAN and NAS protocols, let’s first look at those.

File and block are file system-based methods of storage access.

In both cases, there is a file system. We are all familiar with them – FAT and NTFS in Windows, ext in Linux, and so on. They organise data into files and folders in a tree-like hierarchy and give a path to the file while also retaining a small amount of metadata about the file.

That is the part we see. But under the bonnet, the file system also maps each file path to the physical location of the blocks of storage on the media itself.
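To make that path-to-block mapping concrete, here is a minimal sketch in Python. It assumes a hypothetical in-memory `ToyFileSystem` with a tiny 4-byte block size; the class, its methods and the example path are illustrative inventions, not how FAT, NTFS or ext are actually implemented.

```python
BLOCK_SIZE = 4  # bytes per block; deliberately tiny so the mapping is easy to see


class ToyFileSystem:
    """Hypothetical toy file system: maps a path to block numbers on a 'disk'."""

    def __init__(self, num_blocks):
        self.disk = [b""] * num_blocks       # the media: a flat array of blocks
        self.free = list(range(num_blocks))  # block numbers not yet allocated
        self.table = {}                      # path -> (block numbers, size in bytes)

    def write(self, path, data):
        # Release any blocks held by a previous version of this path.
        old_blocks, _ = self.table.pop(path, ([], 0))
        self.free.extend(old_blocks)
        needed = -(-len(data) // BLOCK_SIZE)  # ceiling division
        if needed > len(self.free):
            raise OSError("no space left on toy disk")
        blocks = [self.free.pop(0) for _ in range(needed)]
        for i, block_no in enumerate(blocks):
            self.disk[block_no] = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
        self.table[path] = (blocks, len(data))

    def read(self, path):
        # Follow the path to its block list, then reassemble the bytes.
        blocks, size = self.table[path]
        return b"".join(self.disk[b] for b in blocks)[:size]


fs = ToyFileSystem(num_blocks=8)
fs.write("/docs/report.txt", b"hello world")
print(fs.table["/docs/report.txt"])  # ([0, 1, 2], 11)
print(fs.read("/docs/report.txt"))   # b'hello world'
```

The user only ever sees the path; the block numbers in `table` are the part handled under the bonnet.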

The key difference between file access/NAS and block access/SAN is that in NAS, the file system resides on the array. Here, an application’s I/O requests go via the file system resident on the NAS hardware, accessed as a volume or drive. In a SAN, the file system is external to the array and I/O calls are handled by the file system on the server, with only block-level information required to access data from the SAN.

Key practical difference

From that distinction arises the key practical difference between NAS and SAN.

NAS is best suited to retention and access of entire files and has locking systems that prevent simultaneous changes from corrupting files.

Meanwhile, SAN systems allow changes to individual blocks within files and so are extremely well suited to database and transactional processing.
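Block-level access can be sketched as an in-place write at a block offset. The example below is an illustration only: it uses an ordinary temporary file as a stand-in for a SAN volume, and the 512-byte block size and the `update_block` helper are assumptions made for the demonstration.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size, for illustration only


def update_block(path, block_no, payload):
    """Overwrite a single block in place, leaving all other blocks untouched."""
    if len(payload) != BLOCK_SIZE:
        raise ValueError("payload must be exactly one block")
    with open(path, "r+b") as f:          # read/write without truncating
        f.seek(block_no * BLOCK_SIZE)     # jump straight to the block's offset
        f.write(payload)


# Create a stand-in "volume" of four zero-filled blocks.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(4 * BLOCK_SIZE))
    demo_path = tmp.name

# Change only block 2, as a database might when updating one page.
update_block(demo_path, 2, b"\xff" * BLOCK_SIZE)

with open(demo_path, "rb") as f:
    contents = f.read()
os.remove(demo_path)

print(len(contents))                                     # 2048
print(contents[2 * BLOCK_SIZE:3 * BLOCK_SIZE][:4])       # b'\xff\xff\xff\xff'
```

Only 512 of the 2,048 bytes are rewritten, which is why this access pattern suits workloads that make many small changes inside large files.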

Both usually come as array products, even if software-defined, and – depending on how high-end the product is – with features such as synchronous and asynchronous replication, snapshots, compression and deduplication, and storage tiering. Both can also take advantage of flash storage.

SAN and NAS are well suited to what they do, but have drawbacks.

For example, NAS can be limited by scale. Historically, organisations put in a NAS box to service a department, but these proliferated and were unconnected, leading to silos of data. This issue is overcome with scale-out NAS, where multiple NAS instances operate a single, highly scalable parallel file system.

The tree-like file system hierarchy can handle millions of files quite easily, but once you scale to billions, it can start to slow down.

Massive scalability

Object storage brings massive scalability. That is because it works differently from the SAN and NAS protocols. It has no file system but, like NAS, it handles changes at the level of the whole file, or object, rather than the individual block.

Instead of a tree-like hierarchy, object storage organises files, or objects, in a flat layout. Objects are just objects, with unique identifiers.
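A flat layout of this kind can be sketched as a single lookup table keyed by identifier. The `ToyObjectStore` class below is hypothetical, and generating identifiers with `uuid4` is an assumption made for the sketch; real object stores choose their own identifier schemes.

```python
import uuid


class ToyObjectStore:
    """Hypothetical in-memory object store with a flat namespace."""

    def __init__(self):
        self._objects = {}  # object ID -> object data; no folders, no tree

    def put(self, data):
        """Store a whole object and return its unique identifier."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = data
        return object_id

    def get(self, object_id):
        """Retrieve a whole object by identifier; there is no path to walk."""
        return self._objects[object_id]


store = ToyObjectStore()
first_id = store.put(b"quarterly report")
second_id = store.put(b"sensor readings")

print(store.get(first_id))    # b'quarterly report'
print(first_id != second_id)  # True
```

Every lookup is a single step from identifier to object, regardless of how many objects the store holds, whereas a hierarchy has to be traversed level by level.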

That means object storage is massively scalable, to billions of objects, because the organisation of objects does not become unwieldy as the store grows.

Objects also have metadata, and potentially lots of it, all definable by the customer. That means any attribute can be associated with an object in its header metadata: the application it is associated with, its data protection characteristics, tiering information, when it should be deleted, and custom business- or organisation-related attributes.

So, object storage is eminently suited to analytics, because very large datasets can be searched on almost any attribute.
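The sketch below illustrates customer-defined metadata and attribute search. It assumes a hypothetical in-memory `ToyMetadataStore`; the attribute names (`application`, `tier`, `delete_after`) and object IDs are made up for the example, and the linear scan in `find` stands in for whatever indexing a real product would use.

```python
class ToyMetadataStore:
    """Hypothetical in-memory store: each object carries free-form metadata."""

    def __init__(self):
        self._objects = {}  # object ID -> (data, metadata dict)

    def put(self, object_id, data, **metadata):
        """Store an object together with any attributes the caller defines."""
        self._objects[object_id] = (data, dict(metadata))

    def find(self, **criteria):
        """Return the IDs of objects whose metadata matches every criterion."""
        return sorted(
            object_id
            for object_id, (_, meta) in self._objects.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        )


store = ToyMetadataStore()
store.put("obj-001", b"payload", application="crm", tier="hot",
          delete_after="2027-01-01")
store.put("obj-002", b"payload", application="crm", tier="cold")
store.put("obj-003", b"payload", application="hr", tier="cold")

print(store.find(application="crm"))               # ['obj-001', 'obj-002']
print(store.find(application="crm", tier="cold"))  # ['obj-002']
```

Because the attributes are arbitrary key-value pairs, the same store can be queried by application, by tier, or by any business attribute added later.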

Data protection is usually by erasure coding and sometimes by replication. Erasure coding is considered the more efficient of the two because it stores less redundant data.
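The difference in overhead is simple arithmetic. The sketch below compares raw capacity under the two approaches; the figures used (100TB of data, three-way replication, and an erasure coding layout of 10 data fragments plus 4 parity fragments) are assumptions chosen for illustration, not recommendations or values from any particular product.

```python
def replication_raw_capacity(data_tb, copies):
    """Raw capacity needed when every object is stored `copies` times."""
    return data_tb * copies


def erasure_coding_raw_capacity(data_tb, data_fragments, parity_fragments):
    """Raw capacity needed when data is split into `data_fragments` pieces
    and `parity_fragments` extra pieces are added for protection."""
    return data_tb * (data_fragments + parity_fragments) / data_fragments


data_tb = 100  # usable data to protect, in TB (illustrative)

replicated = replication_raw_capacity(data_tb, copies=3)
erasure_coded = erasure_coding_raw_capacity(
    data_tb, data_fragments=10, parity_fragments=4
)

print(f"3 copies:   {replicated} TB raw, {replicated - data_tb} TB overhead")
print(f"10+4 coded: {erasure_coded} TB raw, {erasure_coded - data_tb} TB overhead")
```

Under these assumptions, replication needs 300TB of raw capacity and erasure coding needs 140TB to protect the same 100TB of data.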

Typically, however, object storage data is “eventually consistent”, which means the multiple copies or fragments that data protection schemes rely on are not written instantaneously, or anywhere near it. They eventually become consistent with each other as erasure coding or replication works its way between locations.
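Eventual consistency can be illustrated with a toy model of two locations. The `EventuallyConsistentStore` class below is hypothetical: writes land at a primary location immediately and reach the replica only when `sync` runs, which here stands in for background replication.

```python
from collections import deque


class EventuallyConsistentStore:
    """Hypothetical two-location store with delayed replication."""

    def __init__(self):
        self.primary = {}        # location that accepts writes
        self.replica = {}        # second location, updated later
        self._pending = deque()  # writes not yet applied to the replica

    def put(self, key, value):
        """Write to the primary now and queue the change for the replica."""
        self.primary[key] = value
        self._pending.append((key, value))

    def read_replica(self, key):
        """Read from the replica, which may lag behind the primary."""
        return self.replica.get(key)

    def sync(self):
        """Apply queued writes in order, bringing the replica up to date."""
        while self._pending:
            key, value = self._pending.popleft()
            self.replica[key] = value


store = EventuallyConsistentStore()
store.put("invoice-42", b"version 1")

before_sync = store.read_replica("invoice-42")
store.sync()
after_sync = store.read_replica("invoice-42")

print(before_sync)  # None
print(after_sync)   # b'version 1'
```

A read from the second location between the write and the sync sees nothing, or stale data, which is the window that matters for applications needing immediate consistency.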
