You can take it for granted that the technology industry will produce bigger, faster, more capable devices every year, and that users will lap up the increased power on offer. The storage industry, however, is worried about the consequences that will result from an imminent growth spurt that will accelerate disk drives’ capacity to ten terabytes and beyond.
The spurt will come thanks to advances in a field called “areal density,” a term that describes the quantity of data that can be stored on a given area of a disk’s surface*.
Areal density has improved markedly in recent years, thanks to innovations in the magnetic materials (disk wonks jokingly call this “rust”) used on the surface of the platters that whir around inside disk drives.
These materials have improved thanks to advances in the science of “magnetoresistance,” a quality of some metallic substances that sees them change their electrical resistance when exposed to a magnetic field. Magnetoresistance is useful in disk drives because changing the electrical resistance of disk media is a great way to represent a zero or a one.
10 terabyte disks vs. tape
The next generation of disk drives will arrive at around the same time as LTO-6, a tape format forecast to have a capacity of 6.4 terabytes of compressed data. 10 terabyte disks will surpass that comfortably, before compression is taken into account.
Disk will therefore, for the first time in many years, have more capacity than tape, plus disk’s other advantages of faster read and write times.
John Martin, a NetApp Consulting Systems Engineer for Australia and New Zealand, believes tape is threatened as never before.
“I ran a business that pulled data off old tapes and any tape over two years old had a 25% chance of not being able to be read,” he says. “The way tapes are treated means they have shorter shelf life than advertised. Tape’s longevity claims sound good but no-one has kept one around for 30 years to test them.”
“People tend to be more careful with hard drives,” he contends, adding that disks do not suffer from some of the media degradation issues that tapes suffer. “It is also easier to spin up a disk and do a statistical sample of their performance.”
Martin also points out that because disks have a file system, they can easily be read. Tapes, by contrast, generally rely on a format determined by a particular piece of backup software.
“With the tape you need to keep the backup application, and will you have the software you use now in 30 years?” he asks.
Symantec’s Sean Derrington disagrees.
“I don’t think anyone should take all their tapes to the landfill yet,” he says. “Tape cartridges are roughly $50 and tape drives are about $1500, so tape still has a very significant price per gigabyte advantage,” especially once the cost of storage arrays and the electricity they require is taken into account. Derrington also says that few disks are designed to operate for many years, creating a need for migration from disk to disk over time to ensure reliability.
“But with tape you can be pretty confident about a 25 year migration strategy,” he says, thanks to long roadmaps and broad industry acceptance offered by the LTO consortium.
But NetApp’s Martin believes disk-makers have an answer. “You will find that 10 terabyte disks in removable canisters will be looked at as offline backup media,” he predicts.
“Flash is disk, disk is tape and tape is dead!”
Various researchers have, over the years, discovered materials with greater levels of magnetoresistance, which allow data to be written with greater precision so that a smaller area of a disk’s surface is required to record a one or a zero. IBM famously dubbed one such innovation, its 2001 addition of the rare metal ruthenium to its disks, “pixie dust” and boasted of areal density up to 25.7 gigabits (around 3.2 gigabytes) per square inch.
Pixie dust boosted areal density, but was held back by the prevalent “longitudinal” method of recording data that sees information smeared across a disk’s surface. A newer technology, “perpendicular” recording, changes the magnetism of media by sending a charge down inside the “rust” instead of just changing the surface layer’s value. The deeper changes mean that information is written to smaller spaces on the disk’s surface, increasing areal density.
Perpendicular recording is today’s dominant technology, but disk-makers believe it is running out of steam and are investigating two new technologies to bring us more advances in areal density.
One of those technologies is called Patterned Media, an idea which will see disks divided into millions of tiny, discrete “cells” that are smaller than the areas of disk that perpendicular recording currently requires to record data. Patterned Media will therefore be able to pack more data into less space.
The second technology has the spiffy acronym HAMR – short for Heat Assisted Magnetic Recording – and involves new materials that can store data in smaller spaces, provided they are heated before data is written. HAMR drives will therefore include a small laser which zaps the surface of a disk to warm it up before data is written.
Development of HAMR is already advanced. In early 2009, drive-maker Seagate demonstrated a 250GB drive using the technology. TDK demonstrated HAMR technology at an October 2009 trade show in Japan and mentioned one terabit per square inch densities.
Hitachi GST, the main proponent of Patterned Media, has often predicted its debut in 2011 or 2012.
Both companies predict that by 2012 or 2013, their technologies will result in disk drives that easily reach ten terabytes. By contrast, LTO-6, the largest mainstream tape technology currently envisaged, will reach 3.6 terabytes (uncompressed) in 2012 or 2013. Seagate and Hitachi GST are both optimistic that they can use HAMR and/or Patterned Media to reach 50 terabyte drives before the 2010s come to an end.
Big is problematic
The storage industry, however, is not yet certain what to do with very large disk drives.
Simon Elisha, Hitachi Data Systems’ Chief Technologist for Australia and New Zealand, believes they will meet some users’ needs by occupying less space than their predecessors.
“I was talking to a customer who has decommissioned a [storage] system and archived the data,” he explains. “Only four people will ever need to see the data, but when they do they will need it ASAP, so they want it to take up a very small amount of space in the data centre.”
“That is where people are seeing large drives as having some benefit.”
But size is not important for everyone, says Clive Gold, EMC’s Chief Technology Officer, Marketing.
“Today, we can give you 75 petabytes of storage in a 19” rack, and we are not even using 2.5 inch drives,” he says. “The problem today is speed,” he contends, as it will take longer to read data from a large drive than from a smaller device. “It could take a month to rebuild a 10TB drive in a RAID set,” a period of time that makes RAID – the mainstay of data protection – considerably less useful!
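To put Gold’s rebuild worry into numbers, here is a rough sketch in Python that divides a drive’s capacity by a sustained rebuild rate. The function name and the three rebuild rates are assumptions made for illustration; they are not figures from EMC or any drive vendor, and real rebuilds depend on the RAID level, controller and competing workload.

```python
# Back-of-the-envelope RAID rebuild time: capacity divided by sustained rebuild rate.
# The rebuild rates below are illustrative assumptions, not measured figures.

SECONDS_PER_DAY = 24 * 60 * 60


def rebuild_days(capacity_tb: float, rate_mb_per_s: float) -> float:
    """Days needed to rewrite a whole drive at a steady rate (decimal units)."""
    capacity_mb = capacity_tb * 1_000_000  # 1 TB = 1,000,000 MB
    return capacity_mb / rate_mb_per_s / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical rates: an idle array, a busy array, a heavily loaded array.
    for rate in (100, 20, 4):
        print(f"10 TB at {rate:>3} MB/s: {rebuild_days(10, rate):5.1f} days")
```

Under these assumptions a 10TB drive rebuilds in a little over a day at 100 MB/s, but needs about 29 days if production traffic leaves only 4 MB/s for the rebuild, which is the sort of scenario in which a month-long rebuild becomes plausible.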
Sean Derrington, Director of Symantec’s Storage and Availability Management Group, also has worries about speed.
“The slowest part of the input/output equation is always the mechanical drive,” Derrington says. No matter what the size, he says, “they will all have the same rotational latency,” which means that a very large hard disk may store more information, but will generally be slower at delivering it to applications as the drive will have more work to do to find data.
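Derrington’s point can be illustrated with a short Python sketch. Average rotational latency is half of one platter revolution, so it depends on spindle speed and not on capacity. The second function is a deliberately crude model, and the 8.5 ms seek time used in the example is an assumed figure, not one quoted by Symantec.

```python
# Average rotational latency depends only on spindle speed: on average the
# platter must turn half a revolution before the wanted sector reaches the head.

def avg_rotational_latency_ms(rpm: int) -> float:
    """Half of one revolution time, in milliseconds."""
    return 0.5 * 60_000 / rpm


def random_reads_per_s_per_tb(rpm: int, capacity_tb: float, seek_ms: float) -> float:
    """Rough random reads per second available for each terabyte stored.

    Crude model: one random read costs one seek plus the average rotational
    latency; transfer time and caching are ignored.
    """
    service_ms = seek_ms + avg_rotational_latency_ms(rpm)
    return (1000 / service_ms) / capacity_tb


if __name__ == "__main__":
    print(f"7,200 RPM latency: {avg_rotational_latency_ms(7200):.2f} ms")
    for capacity in (1, 10):  # terabytes
        rate = random_reads_per_s_per_tb(7200, capacity, seek_ms=8.5)  # assumed seek time
        print(f"{capacity:>2} TB drive: {rate:.1f} random reads/s per TB")
```

In this model a 7,200 RPM drive has the same latency of roughly 4.2 ms whether it holds one terabyte or ten, so the larger drive offers a tenth of the random reads per terabyte stored.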
Faster hard disks could theoretically solve this problem, making larger capacity drives better at delivering data to applications, but faster drives consume more electricity and make more heat. Many data centres simply cannot access the electricity needed for more power-hungry appliances or for cooling equipment to ensure their continued operation, making faster drives unlikely.
Instead, it is likely that large capacity drives will find a home in tiered storage implementations. Organisations will take advantage of their size to tackle space issues, but use them alongside faster storage media like solid state disks to ensure that the data which is frequently required by an application is delivered faster than large mechanical drives can deliver it.
“You can get some flash drives in to boost input/output rates,” says EMC’s Gold. “Maybe between flash and serial ATA drives you have some fiber channel drives in a tiered model that moves data around.
“If I know which data will be hot, I move it to flash, or if it is not used a lot, push it to a lower cost tier of storage,” such as a very large disk.
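The tiering idea Gold describes could be sketched along the following lines in Python. The tier names, the `Dataset` type and the read-count thresholds are all hypothetical, chosen only to show the shape of such a policy; this is not how any EMC product decides where data lives.

```python
from dataclasses import dataclass

# Thresholds and tier names are hypothetical, for illustration only.
HOT_READS_PER_DAY = 1000
WARM_READS_PER_DAY = 10


@dataclass
class Dataset:
    name: str
    reads_per_day: int


def choose_tier(dataset: Dataset) -> str:
    """Place frequently read data on fast media and rarely read data on large disks."""
    if dataset.reads_per_day >= HOT_READS_PER_DAY:
        return "flash"
    if dataset.reads_per_day >= WARM_READS_PER_DAY:
        return "fibre-channel"
    return "large-sata"


if __name__ == "__main__":
    for ds in (Dataset("orders-db", 50_000), Dataset("mail", 200), Dataset("archive", 1)):
        print(f"{ds.name}: {choose_tier(ds)}")
```

A real implementation would measure access patterns continuously and move data between tiers as they change, which is where the management overhead discussed next comes in.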
But doing so, warns Symantec’s Derrington, will mean a need for greater management of storage arrays.
“Organisations will need to look at input/output all the way from the application to the individual spindle [disk] on which data resides,” he says, as only by understanding the capabilities of every storage device will it be possible to tune systems to give applications the performance they require.
“If you do not have a plan in place to manage storage utilisation you cannot optimise it, so if you do not know where the information resides – the serial number of a spindle – you cannot optimise storage.”
Gold also believes that the advent of very large disks may change the way we think about storage.
“We need to think about the format written to the drive,” he says. “Today we take ones and zeros and put them onto blocks and sectors of a disk.” A new alternative is “binary large objects” (BLOBs) which he says “make a drive an object store rather than a block store. Today one of the issues is that metadata is buried in file structure. But if we take a metadata layer and abstract it, we can search an index before we get into the object itself.” The result is faster data retrieval, a possible boost to the performance of a very large drive and therefore a potential route to their wider use.
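As a minimal sketch of the abstraction Gold describes, the Python below keeps metadata in an index that is separate from the objects themselves, so a search never has to open an object. The `ObjectStore` class and its method names are invented for this illustration and do not correspond to any vendor’s API.

```python
from typing import Dict, List


class ObjectStore:
    """Toy object store with a metadata index held apart from the object data."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}          # object id -> raw data
        self._index: Dict[str, Dict[str, str]] = {}   # object id -> metadata

    def put(self, object_id: str, data: bytes, **metadata: str) -> None:
        """Store the data and record its metadata in the separate index."""
        self._objects[object_id] = data
        self._index[object_id] = dict(metadata)

    def find(self, **criteria: str) -> List[str]:
        """Return ids whose metadata matches every criterion; reads the index only."""
        return [
            object_id
            for object_id, meta in self._index.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]

    def get(self, object_id: str) -> bytes:
        """Fetch the object itself, only once the index has identified it."""
        return self._objects[object_id]


if __name__ == "__main__":
    store = ObjectStore()
    store.put("scan-001", b"...", owner="finance", year="2008")
    store.put("scan-002", b"...", owner="legal", year="2009")
    print(store.find(owner="legal"))
```

On a very large drive the index would be small enough to sit on faster media, leaving the slow mechanical disk to be touched only for the objects that are actually wanted.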
Another possible change large drives could force, according to NetApp’s Martin, is for “backup as we know it to disappear.”
“It is going to become impossible to use RAID because the rebuild times will be so long,” he opines. “Most file level backups are random, so you will see that style of backup disappear in favour of image style backups.”
* The industry measures areal density in “bits per square inch” and has settled on this measure because, while drives are named after their sizes (2.5 inch and 3.5 inch drives are the main form factors on sale today), the disks inside a drive vary in size depending on vendors’ whims.