
Storage 101: Object storage in the big three public clouds

We look at the object storage services of the big three cloud providers – Amazon S3, Microsoft Azure Blob and Google Cloud Storage – and how customers can achieve compatibility between them

Large chunks of public cloud storage are built around object storage. Block and file are dominant in the datacentre, but in the cloud, where the need is for large amounts of relatively cheap storage for unstructured data, object is king.

We’ve looked elsewhere at the pros and cons of object versus block and file storage. Here, we run the rule over the basics of the object storage environments used by the big three cloud providers.

S3 has emerged as a de facto standard. It takes its name from Amazon's Simple Storage Service, the object storage platform where it originated, but the S3 API has evolved into a widely used means of addressing storage in the cloud and in on-premises object storage hardware from the likes of Scality.

But S3 is just one company’s language for cloud object storage. Microsoft has Azure Blob Storage and Google has its Cloud Storage too.

Each of these is similar, but they won’t directly talk to each other – so how do customers effect communication between the big three’s cloud object storage environments?

While it is entirely possible to write scripts to move and manage data between the three clouds, when it comes to readily available solutions from the cloud providers, the picture is not even.

It’s quite easy, for example, to find application programming interface (API) connectors to allow access to S3 storage from Azure and Google Cloud Storage. And Google makes a big play about migrating data from S3 to its own cloud, for obvious reasons. But there’s no obvious product or service to help access Google object storage from S3.

The market – and the relative strength of the suppliers – dictates what is available, to a large extent.

But the main purpose of this piece is to give an outline of the main object storage environments in the cloud, so here goes.

Cloud object storage in general

In any of the providers’ object storage schemes, an object can be any piece of data – a file, an image or some other kind of unstructured content – and is typically stored with metadata that identifies and describes the content.

It is also held in a flat structure, unlike the hierarchical file structure seen with network file system (NFS), common internet file system (CIFS) or server message block (SMB) storage.
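The flat namespace and per-object metadata described above can be sketched in a few lines of Python. This is a toy in-memory store, not any provider's API: keys may contain slashes, but the store treats them as opaque strings rather than directories.

```python
# Toy in-memory object store illustrating a flat key space with
# per-object metadata. Not any cloud provider's API -- a sketch only.

class Bucket:
    def __init__(self):
        self._objects = {}  # key -> (data, metadata); no directory tree

    def put(self, key, data, metadata=None):
        # "Folders" such as "photos/2024/" are just part of the key string.
        self._objects[key] = (data, metadata or {})

    def get(self, key):
        return self._objects[key]

    def delete(self, key):
        self._objects.pop(key, None)

bucket = Bucket()
bucket.put("photos/2024/cat.jpg", b"\x89PNG...", {"content-type": "image/png"})
data, meta = bucket.get("photos/2024/cat.jpg")
```

Because the namespace is flat, "listing a folder" in a real object store is just a prefix query over keys, not a directory traversal.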

Where the providers mostly differ is in what they call things and how customers can access content in or move it between different clouds.

Amazon S3

In S3, objects can range from a few kilobytes up to 5TB in size, and objects are arranged into buckets that provide administration and multitenancy functionality.

S3 can be accessed using HTTP(S), REST-based APIs and a web browser console.

Commands comprise "Put" to store new objects, "Get" to retrieve them and "Delete" to remove them. An update of an object is a Put request that overwrites an existing object, or creates a new version of it if versioning is enabled.
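Those commands map directly onto HTTP verbs in S3's REST interface. A minimal sketch follows: the bucket and key names are made up, no request is actually sent, and a real request would also need an AWS Signature Version 4 authorisation header.

```python
# Sketch of how S3's Put/Get/Delete commands map to HTTP verbs against
# an object's URL. Bucket/key names are hypothetical; real requests
# must carry AWS Signature Version 4 authentication. Nothing is sent.
from urllib.request import Request

BUCKET_URL = "https://example-bucket.s3.amazonaws.com"

def s3_request(verb, key, body=None):
    # PUT stores (or overwrites) an object, GET retrieves it,
    # DELETE removes it -- the URL shape stays the same.
    return Request(f"{BUCKET_URL}/{key}", data=body, method=verb)

put_req = s3_request("PUT", "reports/q1.csv", body=b"date,total\n")
get_req = s3_request("GET", "reports/q1.csv")
del_req = s3_request("DELETE", "reports/q1.csv")
```

In practice the SDKs mentioned below wrap this request construction and signing for you.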

In S3, each object has a user-chosen key that must be unique within its bucket. S3 can also be accessed via software development kits (SDKs) for languages that include Java, .Net, PHP and Ruby.

Amazon S3 storage tiers include Standard, Standard (Infrequent Access) and Glacier, with different pricing and access times for each.

Microsoft Azure Blob

Azure Blob is the Microsoft equivalent of Amazon's S3 object storage service. Within it, a "container" plays the role of a bucket as the framework for retention of objects, while each blob is an individual object.

Objects in Blob storage (the name comes from "binary large object") can be accessed via HTTP(S) URLs, and by users or client applications via the Azure Storage REST API, Azure PowerShell, the Azure CLI, or Azure Storage client libraries available for multiple languages, including .Net, Java, Node.js, Python, PHP and Ruby.

You can upload, download, list, move and carry out other commands on blobs using PowerShell and the command line interface (CLI). Via the REST API, similar Get, Put and Delete commands exist as in S3.
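At the REST level the shape is much the same as S3: a blob lives at a predictable HTTPS URL under its storage account and container, and the HTTP verb does the work. A small sketch follows; the account and container names are invented, nothing is sent, and a real call would need an Authorization header or SAS token.

```python
# Sketch of Azure Blob's REST addressing: each blob sits at a stable URL
# under its storage account and container. Account/container names are
# hypothetical; real requests need an Authorization header or SAS token.
from urllib.request import Request

ACCOUNT = "examplestorageacct"   # hypothetical storage account
CONTAINER = "backups"            # hypothetical container

def blob_request(verb, blob_name, body=None):
    url = f"https://{ACCOUNT}.blob.core.windows.net/{CONTAINER}/{blob_name}"
    return Request(url, data=body, method=verb)

upload = blob_request("PUT", "db/nightly.bak", body=b"...")
download = blob_request("GET", "db/nightly.bak")
```

The client libraries listed above hide this URL construction and authentication behind higher-level upload/download calls.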

Maximum object size is 4.7TB.

S3 users can access Azure storage via S3 commands by use of tools that include S3Proxy, which allows applications that use the S3 API to access Blob storage.

Meanwhile, Scality’s Zenko Connect for Azure provides an S3 API-compatible front-end translator to Azure Blob storage.

There is some interoperability between Azure and Google Cloud Storage, with the ability to copy data from the latter to the former using Azure Data Factory, for example. 

Google Cloud Platform

As in S3 and Azure, Google Cloud Storage provides storage for objects within buckets.

As with Amazon S3, there is a maximum size of 5TB for individual objects. There is also an update limit of once per second on each object, which means rapid writes to a single object won’t scale.
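That update limit makes it sensible to batch writes, or to retry with backoff rather than hammer a single object. The following is a generic client-side sketch, not part of any Google SDK: `write_fn` and `RateLimitError` are hypothetical stand-ins for whatever write call and rate-limit exception your client library exposes.

```python
# Generic exponential-backoff retry for rate-limited writes to a single
# object. Client-side pattern only -- not a Google SDK feature.
# write_fn is a hypothetical callable that raises RateLimitError when
# the per-object update limit is hit.
import time

class RateLimitError(Exception):
    pass

def write_with_backoff(write_fn, retries=5, base_delay=1.0):
    for attempt in range(retries):
        try:
            return write_fn()
        except RateLimitError:
            time.sleep(base_delay * (2 ** attempt))  # wait 1s, 2s, 4s, ...
    raise RateLimitError(f"gave up after {retries} attempts")
```

Where many clients must update shared state rapidly, the usual design is to write to many distinct objects and aggregate later, rather than retrying against one hot object.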

Objects are addressable via HTTP(S) URLs and Google has its own CLI, called gsutil, as well as APIs, and web graphical user interface (GUI) access.

Google Cloud Storage comes in four classes characterised by availability, access time and cost: Multi-Regional Storage, Regional Storage, Nearline Storage and Coldline Storage.

There is some level of interoperability between Google Cloud Storage and Amazon S3. Besides migration, you can, for example, manage and work with S3 buckets from the gsutil command line, and use Google’s Storage Transfer Service to select and transfer objects from S3.
