
Cloud storage: Key storage specifications

We look at the key specs in cloud storage, including availability – such as five nines – bandwidth, IOPS and latency, capacity and tiering functionality, egress charges and security

Enterprise cloud storage is more flexible and offers more features and performance than ever.

Even among the three main cloud services – Google Cloud Platform, Amazon Web Services and Microsoft Azure – there is now a comprehensive range of capacity, availability, performance and security options.

However, this choice does not make the IT department’s task any easier. Buyers have to balance cost, performance, application compatibility and flexibility, while also comparing offers across the three suppliers.

But doing this well is vital to making the most of the cloud. Here we set out some of the key specifications when sourcing cloud storage.

Storage assumptions

Choice of storage architecture is usually driven by application support or the use case, and cloud providers now offer block or file storage formats as well as “pure” object storage.

Alternatively, for applications such as archiving or low-end personal storage, the cloud service provider will choose an effective underlying storage architecture, most often object. Dropbox, for example, recently moved to an object storage architecture.

Cost, too, remains a key consideration. For some applications, it is the most important factor when it comes to cloud storage – and cost and performance are intertwined.

Deciphering the cost structures for cloud services is a task in itself, with the need to account for bandwidth, storage capacity, egress fees, location and even application programming interface (API) calls. And with the large cloud service providers now offering performance tiers, buyers need to trade off price against performance in any cloud storage buying decision.

Key storage criteria:

  • Availability

Firms that store data in the cloud need to know the availability the service provider offers, so they can compare with on-premise systems and the needs of the business. Not all applications need “telco-grade” or five-nines availability, however, and this can save cost.

Amazon’s S3 Standard storage offers service-level agreements (SLAs) with 99.99% availability, but some S3 storage classes offer 99.9% and the S3 One Zone-IA storage class aims for 99.5%. Azure offers 99.99% for its Azure NetApp Files via locally redundant storage.

Google Cloud ranges from 99.9% for its Coldline and Archive products to 99.99% for standard storage, or better for multi- and dual-region setups.

However, the actual metrics are rather more complicated than the SLAs suggest and need careful study. AWS quotes 11 nines for some setups, for example, but that figure describes data durability rather than availability.
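To make the availability percentages above easier to compare, they can be converted into the maximum downtime they allow per year. The following is a minimal sketch of that arithmetic only; the function name is illustrative, and it assumes a 365-day year and that downtime is measured over the whole year, whereas providers' SLAs usually measure it per billing period.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability_percent: float) -> float:
    """Maximum downtime per year implied by an availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Availability levels mentioned in the text, plus five nines.
for pct in (99.5, 99.9, 99.99, 99.999):
    hours = annual_downtime_hours(pct)
    print(f"{pct}% -> {hours:.2f} hours/year ({hours * 60:.1f} minutes)")
```

On these assumptions, 99.5% allows nearly 44 hours of downtime a year, 99.9% under nine hours, 99.99% about 53 minutes and five nines just over five minutes.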

  • Bandwidth and IOPS

Bandwidth, IOPS and latency all impact the performance of applications using cloud storage.

Bandwidth is governed by the cloud service provider’s offerings and the capacity of customer links to its datacentres and other systems.

GCP quotes a capacity of 5,000 object reads per second for its storage buckets, with a bandwidth limit of 50Gbps per project, per region when accessing data from a given multi-region. But GCP scales to 1Tbps if required. Amazon claims 50Gbps between EC2 and S3 in the same region. On Azure, a single blob supports 500 requests per second, with block blob storage accounts able to achieve more.

For IOPS, AWS has options from 16,000 to 64,000 per volume via EBS. Azure Managed Disk reaches up to 160,000 IOPS and Azure Files up to 100,000 IOPS.

GCP’s persistent disk runs up to 100,000 IOPS read and its local SSD up to 2,400,000 IOPS read. On all platforms, write is generally slower.

As these data points suggest, and despite the importance of bandwidth and IOPS, it is hard to compare the cloud providers. Firms should look at the detailed requirements of their applications to find the best fit.
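One way to relate headline figures like these to an application's needs is to turn them into transfer times and throughput. The sketch below is back-of-envelope arithmetic under stated assumptions, not a benchmark: the function names, the 100TB dataset, the `efficiency` factor and the 16KiB block size are all illustrative, and real results depend on protocol overhead, request sizes and concurrency.

```python
def transfer_time_hours(data_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move data_tb (decimal terabytes) over a link_gbps link.

    efficiency is the assumed fraction of the link that is actually usable.
    """
    bits_to_move = data_tb * 1e12 * 8
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits_to_move / usable_bits_per_second / 3600


def throughput_mib_per_s(iops: int, block_size_kib: int) -> float:
    """Throughput implied by an IOPS figure at a fixed I/O size."""
    return iops * block_size_kib / 1024


# Hypothetical 100TB dataset over a 50Gbps link and a 1Gbps link.
for gbps in (50, 1):
    print(f"100TB at {gbps}Gbps: {transfer_time_hours(100, gbps):.1f} hours")

# 64,000 IOPS at an assumed 16KiB per operation.
print(f"64,000 IOPS at 16KiB: {throughput_mib_per_s(64_000, 16):.0f} MiB/s")
```

With a fully usable link, the hypothetical 100TB dataset moves in about four and a half hours at 50Gbps but takes more than nine days at 1Gbps, which shows why the customer's own connectivity matters as much as the provider's limits.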

  • Capacity and tiering

On paper, the capacity of cloud storage is infinite. In practice, there are technical, practical and financial limits. Also, service providers offer storage tiers that help match capacity, performance and cost.

AWS can store data in an S3 bucket with objects in no fewer than seven tiers, from standard to deep archive storage. Intelligent tiering can do some of the heavy lifting, moving data between tiers, depending on use.

Azure provides hot, cool and archive tiers for blob data. Its hot tier has the highest storage costs but the lowest access costs, with the cool and archive tiers charging less for storage and more for access. Google offers four storage classes: standard, nearline, coldline and archive.

It is worth noting that as well as cost and latency differences, there are minimum storage durations for tiers. Microsoft’s archive tier has a 180-day minimum, Google’s a 365-day minimum, and S3’s minimums range from 90 to 180 days.
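The effect of these minimum durations on cost can be sketched with a simple calculation. The example below assumes the provider bills the unused remainder of the minimum period when data is deleted or moved early, which is how early-deletion charges are commonly described; the function name, the 30-day month and the price used are hypothetical, so check each provider's current pricing pages for real figures.

```python
def early_deletion_charge(
    size_gb: float,
    price_per_gb_month: float,
    min_days: int,
    days_stored: int,
    days_per_month: int = 30,
) -> float:
    """Charge for the unused part of a tier's minimum storage duration.

    Assumes the remaining days are billed at the normal storage rate.
    """
    remaining_days = max(0, min_days - days_stored)
    return size_gb * price_per_gb_month * remaining_days / days_per_month


# Hypothetical: 1,000GB in a tier with a 180-day minimum, deleted after 60 days,
# at an assumed (not quoted) price of 0.002 per GB per month.
charge = early_deletion_charge(1000, 0.002, min_days=180, days_stored=60)
print(f"Early deletion charge: {charge:.2f}")
```

Data kept for at least the minimum period incurs no such charge in this model, which is why archive tiers suit data that is rarely touched.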

When it comes to capacity, again it pays to look at the detail. S3 has no maximum bucket size or limit to the number of objects in a bucket, but the maximum size of a single object is 5TB. Google has a 5TiB limit for an object. Azure states a maximum storage account limit of 5PiB by default.

Note, however, that CSPs can also have limits on availability zones and different limits for single and multi-region setups.

  • Retrieval and other (hidden) charges

Often the biggest complaints from firms that run cloud computing infrastructure centre on unexpected or hidden costs.

It is difficult to calculate the true cost of any consumption-based service because it means estimating anticipated demand and then trying to align product performance with that demand. In some cases, the benefits of the cloud will create more demand because it is easy to use and effective. Archiving is a good example of this.

Then there is the question of whether savings on on-premise systems really do materialise with a move to the cloud.

Nonetheless, cloud services have not always done a very good job of making their pricing transparent. A common source of complaint is egress or retrieval costs. Cloud storage can be very cheap, and sometimes even free. But service providers instead levy charges for moving data off their systems. These costs can be hard to predict and can catch customers out.

Cloud service providers are now much more transparent about retrieval charges and provide detailed advice for users on how to structure their storage.

Certainly, some past cost issues stemmed from organisations picking the wrong architecture for the workload, for example by frequently accessing data in long-term storage or putting less-used data on high-performance, low-latency systems and so paying more than they should.

Service providers are further mitigating this through automated tiering. CIOs have to take the logic used for that tiering on trust, but unless a firm has a large and highly skilled storage management team, automated tiering is likely to be more efficient and cheaper than any manual process.
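To see how egress and request charges can change the picture, it helps to model a monthly bill from its separate components. The following is a minimal sketch only: the `Pricing` class, the `monthly_cost` function and every price in it are hypothetical placeholders rather than any provider's real rates, and it ignores free allowances, tiered discounts and regional differences.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Hypothetical unit prices; substitute real figures from a provider's price list."""
    storage_per_gb_month: float
    egress_per_gb: float
    per_1000_requests: float


def monthly_cost(pricing: Pricing, stored_gb: float, egress_gb: float, requests: int) -> dict:
    """Break a monthly bill into storage, egress and API request charges."""
    costs = {
        "storage": stored_gb * pricing.storage_per_gb_month,
        "egress": egress_gb * pricing.egress_per_gb,
        "requests": requests / 1000 * pricing.per_1000_requests,
    }
    costs["total"] = sum(costs.values())
    return costs


# Assumed prices and usage, for illustration only.
example = Pricing(storage_per_gb_month=0.02, egress_per_gb=0.09, per_1000_requests=0.0004)
bill = monthly_cost(example, stored_gb=10_000, egress_gb=2_000, requests=5_000_000)
for item, amount in bill.items():
    print(f"{item:>8}: {amount:,.2f}")
```

With these made-up numbers, reading back a fifth of the stored data in a month costs almost as much as storing all of it, which is the kind of surprise that egress charges tend to produce.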

  • Security

Organisations will always have their own requirements for data security and compliance, especially in fields such as government, healthcare, finance and defence.

For buyers of cloud services, this means matching a cloud service provider’s offering to the organisation’s baseline security requirements. Again, this is an area where the cloud providers have made real strides over the last few years.

Microsoft, for example, recently published the Azure security baseline for Storage, which is in turn part of the firm’s cloud security benchmark.

AWS has similar standards and best practices, while Google also has comprehensive security guidelines. It is also possible to support processing of specialist data, such as PCI-DSS payments information, personal health data, or even classified files in the cloud.

The good news for cloud services buyers is that security levels among the big three providers match and often exceed those for on-premise data storage.
