On the heels of a billion-dollar investment from IBM in a
new "green" storage program, Clod Barrera, engineer and chief
technical strategist, IBM Storage, discusses energy efficiency,
power consumption and cooling issues, all of which are becoming hot
issues (literally and figuratively) in the storage
industry.
What are the 'green' issues particular to storage
systems?
Clod Barrera: There is a very large appetite for storage
capacity and increasingly a large appetite for quick access to
data, so a lot of customers are talking about backing up to disk
rather than to tape, and disk is of course much more power hungry
than tape. There's more and more talk about online archives. People
are talking about petabytes of spinning disk to accomplish this.
Disk array controllers, [storage area network] SAN switches and
host bus adapters for
storage will also become more and more power hungry over time as
speeds increase.
One of the habits a lot of people fell into when storage was just
getting cheaper by half every year is that they stopped managing
storage with real science and real tools. It became easier just to
deploy a lot of it so you would be sure not to run out. That was in
retrospect a bad habit to fall into. Back in the days of mainframe
we had a lot of tools and process to manage storage to as high a
capacity point as could be balanced with performance requirements.
People got medals for reaching high utilisation with high
performance on mainframes 20 years ago, but those practices didn't
get adopted in distributed environments. Now, even if I can still
afford the [capital expense] of hardware, I can't afford the power
-- there's a significant value-add for doing a
better management job.
The simple projections of where this goes over time become quite
daunting. The trend is clear -- if the storage industry is going to
both meet the demands users have for high-performance access to
information and not let the power bill get out of control, we're
going to have to do some innovation.
Do storage arrays from different vendors really draw
different amounts of power? How significant is that?
Barrera: One of the discussions we're having within SNIA
is how to get really good numbers for that purpose. The simple
answer is yes, of course, since different arrays are designed by
different people for different purposes, but we do not have enough
of a standard yardstick to determine the goodness or badness of any
particular box compared to another. There actually is considerable
variability. Currently within SNIA, the first project is to
establish a set of yardsticks: what to measure and how, and
also how many are needed. One good one is watts per gigabyte
stored, but there are a lot of environments where customers don't
deploy on the basis of watts per gigabyte or dollars per gigabyte,
they deploy against ops per second. So you probably need a watts
per ops per second yardstick for true high-performance
transactional environments, and maybe you need a watts per gigabyte
per second for streaming environments and maybe a different
yardstick for archival environments.
We need to argue this out, and we need to do it quickly because
we're better served by experts deciding what the right metrics are
instead of using a single number that's not a good basis for
comparison. We also ought to have a certification process somewhere
so your box can be measured and verified by a third party and then
we'll really know.
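The candidate yardsticks mentioned in this answer (watts per gigabyte, watts per ops per second, and watts per gigabyte per second) can be illustrated with a minimal Python sketch. The two arrays and all of their figures are hypothetical, invented for illustration only; they are not measurements of any real product, and the function names are assumptions rather than SNIA definitions.

```python
from dataclasses import dataclass


@dataclass
class ArrayMeasurement:
    """Hypothetical measurements for one storage array."""
    name: str
    power_watts: float          # average power draw under load
    capacity_gb: float          # usable capacity stored
    ops_per_second: float       # transactional I/O rate
    throughput_gb_per_s: float  # streaming transfer rate


def watts_per_gb(m: ArrayMeasurement) -> float:
    """Yardstick for capacity-oriented deployments."""
    return m.power_watts / m.capacity_gb


def watts_per_ops(m: ArrayMeasurement) -> float:
    """Yardstick for transactional deployments (watts per op per second)."""
    return m.power_watts / m.ops_per_second


def watts_per_gb_per_s(m: ArrayMeasurement) -> float:
    """Yardstick for streaming deployments."""
    return m.power_watts / m.throughput_gb_per_s


# Both arrays below are made-up examples, not real products.
capacity_box = ArrayMeasurement("capacity-box", 2000.0, 100_000.0, 5_000.0, 1.0)
performance_box = ArrayMeasurement("performance-box", 3000.0, 20_000.0, 60_000.0, 2.0)

for box in (capacity_box, performance_box):
    print(
        f"{box.name}: "
        f"{watts_per_gb(box):.3f} W/GB, "
        f"{watts_per_ops(box):.3f} W per op/s, "
        f"{watts_per_gb_per_s(box):.0f} W per GB/s"
    )
```

With these made-up figures, the capacity-oriented array looks better on watts per gigabyte while the performance-oriented array looks better on watts per ops per second, which is why a single number is a poor basis for comparison.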
How realistic is that, though? It's not in a vendor's best
interest to have real numbers come out, especially if it is trying
to sell storage hardware.
Barrera: Once everyone knows real numbers, the race will
be on to build better products for customers. If you have 20% more
efficiency, you can compute what it's worth in dollars for the
customer and compute a better margin. I personally think there will
be enough social and regulatory pressure so that these yardsticks
will come into existence one way or another. My goal is for an
organisation that knows how to do this, like SNIA, representing the
right entities, to decide what those yardsticks need to be. I think
there'll be plenty of help from the rest of the world on the
enforcement process.
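The remark about computing what 20% more efficiency is worth in dollars can be worked through as simple arithmetic. The Python sketch below is illustrative only: the baseline power draw and the electricity price are hypothetical, and "20% more efficiency" is interpreted here as drawing 20% less power for the same work, which is an assumption about what was meant. Cooling overhead is ignored.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_energy_cost(power_watts: float, price_per_kwh: float) -> float:
    """Cost of running a constant load for one year."""
    kilowatts = power_watts / 1000.0
    return kilowatts * HOURS_PER_YEAR * price_per_kwh


def annual_savings(baseline_watts: float, efficiency_gain: float,
                   price_per_kwh: float) -> float:
    """Yearly saving if the same work needs (1 - efficiency_gain) of the power.

    efficiency_gain is a fraction, e.g. 0.20 for the 20% in the interview.
    """
    improved_watts = baseline_watts * (1.0 - efficiency_gain)
    return (annual_energy_cost(baseline_watts, price_per_kwh)
            - annual_energy_cost(improved_watts, price_per_kwh))


# Hypothetical inputs: a 5 kW array and electricity at $0.10 per kWh.
saving = annual_savings(baseline_watts=5000.0, efficiency_gain=0.20,
                        price_per_kwh=0.10)
print(f"Estimated annual saving: ${saving:.2f}")
```

Under these assumed inputs the saving comes to roughly $876 per array per year, before any reduction in cooling load is counted.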
What trends do you see in storage hardware and data
management for green issues over the next year?
Barrera: To me the single biggest rock in the pond early
is going to be the notion of measurements and standards. Once
measurements are in place, you're going to see significant
innovation to drive down power consumption and drive up heat
efficiency -- the nice thing about numbers is they encourage a lot
of behavior. The technologies that will be important over the next
year, however, are all already in place -- things like storage
virtualisation, which drives up utilisation, information lifecycle
management [ILM], which moves data to lower-power media, and data
deduplication and reduction. It will probably take about a year to
get the right measurements in place, another year for best
practices and over time products will get better and better. We'll
probably have a good place to look back from in five years.
What options does IBM have in terms of 'greener' components
for storage systems? What's on the roadmap in storage?
Barrera: There are power efficiencies to be had at the
hardware level: just build a more efficient box. Silicon guys are
working on those problems. We in the storage industry tend to
inherit those technologies.
Deduplication is starting to show up around the industry,
particularly for archival data. It takes a fairly large amount of
computational horsepower, but it's getting cheaper and cheaper, and
when you start talking about large amounts of storage, the value
for taking that step is going to be there. It'll happen
specifically by application types because it turns out that with
different applications there are different processes. The use of
the intelligence that you have in computers to save the smallest
amount of data and commit the least amount of watts when you do
operations is going to be an ongoing effort.
IBM is still holding back on adding deduplication to its
virtual tape libraries [VTL]. Is there any kind of timeframe there
yet?
Barrera: We think the onus on IBM is that by the time we
bring a new technology to market, it has to be perfect. Startups
get away with more than we can. They perform a great service in the
industry as technology trailblazers, but customers buying from us
expect nothing unexpected. So it's important to us to make sure
things are really ready.
What's the ultimate solution to energy problems in the
storage industry? An alternative energy source? Different data
recording technologies?
Barrera: First of all, electricity is what all this stuff
is going to be based on. There's one discussion to be had around
where electricity comes from, and burning fossil fuels better not
be the answer for very much longer. Within the data center itself,
we've been in this lust for performance and availability of data.
We're going to add another lust, and that's going to be energy
efficiency, and those things will find their relative importance
compared to one another. It will become a new arena for competition
between vendors and that's how you will see improvement.
What about replacing spinning drives with solid-state
memory?
Barrera: With today's technologies, that's really only
possible where the goal is a certain level of performance and
the capacity requirement is relatively small, like
the kind of memory you need in cell phones. Solid-state disk is
actually going the wrong way in terms of power utilisation today,
but there are things coming, a variety of technologies being talked
about in terms of storage-class memories. The idea of those taking
over from disk technologies is five, and probably more like 10, years
away. Disk drives are going to be around for a lot longer yet and
so will tape.