The goals of tiered storage
Storage administrators understand that most corporate data is accessed only infrequently. Yet storage volumes escalate each year as corporations generate more electronic content and struggle to meet complex compliance regulations. Expanding storage on the storage area network (SAN) simply by adding expensive Fibre Channel (FC) disks is usually not a cost-effective option. By implementing a tiered storage architecture, an IT organization can dramatically expand storage capacity, control storage costs and improve performance.
The lure of added capacity is compelling; a 500 GB SATA hard drive costs far less than a 146 GB FC drive. When multiplied out over dozens, hundreds or even thousands of drives, the added storage and lower disk cost can be substantial. Administrators can realize these savings by migrating lower-priority data onto the more cost-effective disks. While SATA disks cannot match the performance of high-end FC disks, businesses typically translate tiers of storage into "tiers of service," offering storage users a more practical measure of reliability and accessibility at each level.
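The per-gigabyte arithmetic behind that comparison can be sketched as follows. The drive prices here are hypothetical assumptions for illustration (the article quotes only the capacities, not prices):

```python
# Illustrative cost-per-gigabyte comparison between drive tiers.
# Prices are assumed for illustration, not actual market quotes.
FC_CAPACITY_GB, FC_PRICE = 146, 900      # assumed price of a 146 GB FC drive
SATA_CAPACITY_GB, SATA_PRICE = 500, 300  # assumed price of a 500 GB SATA drive

fc_cost_per_gb = FC_PRICE / FC_CAPACITY_GB
sata_cost_per_gb = SATA_PRICE / SATA_CAPACITY_GB

print(f"FC:   ${fc_cost_per_gb:.2f}/GB")
print(f"SATA: ${sata_cost_per_gb:.2f}/GB")

# The gap compounds across drive counts: the same spend buys far more
# capacity at the SATA tier than at the FC tier.
drives = 100
print(f"{drives} SATA drives provide {drives * SATA_CAPACITY_GB:,} GB "
      f"for ${drives * SATA_PRICE:,}")
```

At these assumed prices the FC tier costs roughly ten times more per gigabyte, which is why migrating low-priority data downward pays off at scale.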
Still, overall storage performance can actually benefit from tiered storage. With all data on a single FC tier, the struggle for access can slow performance. That struggle is eased once storage is reorganized into tiers across two or more storage subsystems. Even though SAS or SATA drives may offer lower performance than FC drives, reduced competition for access may actually allow for good performance at the SAS or SATA tier. And since this also reduces the number of requests arriving at FC drives, top-tier performance may also appear to improve.
Roadblocks to tiered storage
There's no doubt that tiered storage can potentially rein in costs while providing adequate performance, but there are serious issues to consider before moving forward with a strategy. Simply deciding how to organize data within a uniform business process is a challenge that most users underestimate. Data classification is an essential part of tiered storage, helping users to understand the value of their data and its importance to the enterprise over time. However, data classification is largely a manual process -- it demands input and support from every department within the enterprise. Software tools are available to help identify files, automate migration and enforce storage policies, but no software can tell you which data is important. "I would not recommend a tiered storage deployment for organizations that have not developed sufficiently mature processes," says Phil Goodwin, president of Diogenes Analytical Laboratories Inc., noting that most organizations have inadequate policies in place (if any) to classify, store and move data.
Analysts suggest that tiered storage architectures should arise from a careful consideration of each application in terms of access time, data protection, recovery needs, disaster planning and so on. Once those needs are understood for each application, an architecture can be implemented to meet the needs of those applications -- you can better determine which applications belong on each given tier.
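Once those per-application requirements are gathered, the mapping to tiers becomes a mechanical policy decision. A minimal sketch, assuming the classification work described above has already produced access-time and recovery-point requirements; the tier names and thresholds here are hypothetical:

```python
# Sketch: map application requirements to storage tiers.
# Thresholds and tier definitions are assumptions for illustration --
# in practice they come from the classification process, not from code.

def assign_tier(max_latency_ms: float, rpo_hours: float) -> str:
    """Pick a tier from access-time and recovery-point requirements."""
    if max_latency_ms <= 10 and rpo_hours == 0:
        return "Tier-1 (FC, continuous protection)"
    if max_latency_ms <= 50 and rpo_hours <= 24:
        return "Tier-2 (SAS/SATA, daily snapshots)"
    return "Tier-3 (tape/VTL archive)"

# Hypothetical application profiles: (max latency ms, RPO hours)
apps = {
    "order-entry OLTP": (5, 0),
    "departmental file shares": (40, 24),
    "compliance archive": (5000, 72),
}
for name, (latency, rpo) in apps.items():
    print(f"{name}: {assign_tier(latency, rpo)}")
```

The hard part is not the lookup but agreeing on the inputs, which is exactly the manual classification effort the analysts describe.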
There are also many more storage systems to contend with. The days of servers and a tape library are long gone, and organizations must manage a broader proliferation of platforms, including disk arrays, virtual tape library (VTL) systems, content-addressed storage (CAS) systems, remote replication implementations, continuous data protection (CDP) appliances and a large range of tape drive technologies. Not only are there more devices to maintain, there are usually numerous tools required to manage them -- heterogeneous management is still elusive -- so more management time and effort is needed from IT staff, which often divides responsibilities between storage platforms, such as an EMC Corp. Symmetrix versus a Clariion. "There may be some cross-training in there, certainly, but to have one group of people knowledgeable about all of the different systems would not be typical," Goodwin says.
These factors often limit the effectiveness of tiered storage in the enterprise. In many cases, the potential savings are outweighed by the increased hardware maintenance and associated management tasks, such as snapshots. For example, Goodwin notes that Tier-2 storage should cost 20%-30% less than Tier-1, while Tier-3 storage should cost 50%-60% less. In actual practice, however, Goodwin sees only a savings of 10%-15% at Tier-2 and 30%-40% at Tier-3. "In some cases it may be a wash, and it may not be worth the effort in other cases," Goodwin says. "There's a trend among large-scale data centers that are now moving back toward Tier-1 types of storage."
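Goodwin's figures translate into a noticeably smaller blended saving than the targets suggest. The sketch below compares blended cost using the midpoints of his target and observed discount ranges; the capacity mix across tiers is an assumed illustration:

```python
# Blended storage cost under Goodwin's target vs. observed tier discounts.
# Costs are normalized to Tier-1 = 1.0. The 20/30/50 capacity mix across
# tiers is a hypothetical assumption for illustration.
mix = {"tier1": 0.20, "tier2": 0.30, "tier3": 0.50}

# Midpoints of the quoted ranges: target 20-30% / 50-60%,
# observed 10-15% / 30-40%.
target   = {"tier1": 1.0, "tier2": 1 - 0.25,  "tier3": 1 - 0.55}
observed = {"tier1": 1.0, "tier2": 1 - 0.125, "tier3": 1 - 0.35}

def blended(unit_costs):
    """Capacity-weighted average cost relative to an all-Tier-1 setup."""
    return sum(mix[t] * unit_costs[t] for t in mix)

print(f"target blended cost:   {blended(target):.3f} of all-Tier-1")
print(f"observed blended cost: {blended(observed):.3f} of all-Tier-1")
```

Under this assumed mix, the target discounts would cut the bill to about 65% of an all-Tier-1 configuration, but the observed discounts leave it near 79% before counting the extra management overhead, which is why Goodwin calls some deployments "a wash."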
Impact of tiered storage
In spite of its potential pitfalls, tiered storage is proving itself in environments that must maintain uninterrupted service levels. For example, the film industry is leveraging storage tiers to process and play back films -- a data-intensive process. One example is Pacific Title and Art Studio in Hollywood, Calif. With over 350 terabytes (TB) of storage to manage, tiered storage has been the key to Pacific Title's movie processing and playback performance. "When we need to play back a movie, we require about 277 megabytes per second (MBps) per movie," says Andy Tran, CTO and senior executive vice president at Pacific Title. "We have multiple clients running at the same time, and a client cannot skip a frame."
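Tran's 277 MBps-per-stream figure explains why playback demands a dedicated top tier. A quick sizing sketch, where the number of concurrent clients is a hypothetical assumption (the article says only "multiple"):

```python
# Aggregate throughput for concurrent real-time playback, using the
# 277 MBps-per-movie figure quoted above. Client count is an assumption.
STREAM_MBPS = 277   # megabytes per second per movie stream (from the article)
clients = 4         # hypothetical number of concurrent playback clients

aggregate_mbps = STREAM_MBPS * clients
aggregate_gbit = aggregate_mbps * 8 / 1000

print(f"{clients} streams need {aggregate_mbps} MBps "
      f"(~{aggregate_gbit:.2f} Gbit/s) sustained, with no dropped frames")
```

Even a handful of simultaneous streams pushes sustained throughput into multiple gigabits per second, which is the kind of guaranteed-bandwidth workload the Tier-1 platform described below is sized for.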
The architecture at Pacific Title consists of several key platforms. Tier-1 storage supports real-time playback utilizing a DataDirect Networks (DDN) 9550 platform with 4 Gbit FC connectivity. Tier-2 storage handles rendering and other network traffic through an LSI PP9700 system with FC and SATA drives. A StorageTek tape library provides a third tier for long-term archiving of up to a petabyte of data. Tran says that the DDN platform has been ideal for guaranteed playback performance, while rendering and frame-by-frame editing tasks can run over Tier-2 storage. Storage management has not been a significant problem, Tran notes; it takes roughly one full-time person to manage current storage levels.
While Tran reported a smooth tiered storage implementation, he notes that people play a huge role in the process. "Make sure you have the right management people and workflow people," he says. "If you have the wrong data in the wrong place, it doesn't work -- it's a 'human problem.' "
In other situations, tiered storage can be used to dramatically improve workflow. For Ibis Consulting in Providence, R.I., the task is data processing for litigation, taking third-party data from large law firms or other organizations, filtering out information that is pertinent to a case and then converting the data to file types that can be imported into document review tools. With over 250 TB of storage, tiered storage is the key to enhancing workflow. "We use storage like a manufacturing assembly line where we have different disk types and NAS [network attached storage] arrays," says Andrew Beeber, manager of IT at Ibis Consulting. "We read and write data through the process from media upload (where we ingest client data) to the actual deliverable generation."
The move to tiered storage started in 2004 when it simply became too cumbersome to manage the many shares available on NAS systems. Shares were running short of storage, and excess activity across relatively few shares in a given project was degrading storage performance. The company ultimately implemented an Adaptive Resource Switch from Acopia Networks Inc. in early 2005 to virtualize the NAS infrastructure and streamline data handling processes.
Beeber notes that the biggest challenges have been in managing complexity and integration. "We're very CIFS dependent, and CIFS is a very open protocol, so we had challenges with different types of CIFS hand-offs between platforms," Beeber says. "At the same time, we've been challenged with tighter change control as we introduce new code from our storage vendors." In spite of the challenges, Beeber notes that storage management is handled by a senior engineer dedicating about 50% of his time to management. "I could probably get maybe half a petabyte with one person [handling management]," he notes.
Future of tiered storage
Tiered storage is clearly making a positive impact through improved service levels and cost savings, but it's crucial to consider the potential downsides before embarking on a tiered storage initiative. Savings and improvements depend heavily on a thorough understanding of application needs, and on a solid, mature process for storing, migrating and retaining data. In addition to the increase in storage capital expenditures, more tiers and devices, like VTL and CDP, mean more management overhead. "It's not the old two-tier system, meaning disk and tape," Goodwin says. "Some relief may come in the form of better enterprise-class management tools or automated data classification, but those tools are pretty complicated in their own right."
Ultimately, IT organizations will need to view tiered storage from a more tangible standpoint. "I think we're seeing a certain amount of disillusionment now," Goodwin says. "Over the next 12-24 months, IT organizations will get beyond the panacea notion of tiered storage and start to see more practical implementations of it."