You don't have to be Einstein to realise that interoperability is the way forward for storage products, and market leader EMC is showing the way with its latest innovations. Nicholas Enticknap reports.

EMC is the world's largest specialist IT storage company, selling disc storage subsystems plus associated software. According to analyst firm IDC, the company is number one in the storage area network (San) and network attached storage (Nas) markets, and is also top in the direct-attach markets for both Unix and Windows NT/2000 servers. Gartner Dataquest figures show that EMC is the largest supplier in the storage management software market.

EMC started life in 1979 selling add-on memory for IBM mainframes. The success story did not begin, however, until the launch of the Symmetrix Integrated Cached Disc Array (ICDA) in 1990.

Up to this time mainframe disc subsystems were quite different from those used on other systems. They were larger, with massive 14-inch platters; they provided much faster access to data; they were engineered for maximum reliability; and they were much more expensive. The key to EMC's success was its recognition that the performance and reliability levels demanded by mainframe users could be achieved using off-the-peg 5.25-inch drives.

EMC provided the necessary reliability by mirroring. It achieved the performance required by using much larger caches than competing products. This recipe worked, and EMC's effective selling and marketing brought the company to market leadership by 1995.

EMC has maintained this position. First, it recognised that mainframes were increasingly being replaced by large Unix systems, and adapted Symmetrix for them in 1995. Second, it recognised that disc systems were developing into subsystems, and introduced software products to add functionality.

The first of these, Symmetrix Remote Data Facility (SRDF), for mirroring over distance, appeared in 1994. Another major product is Timefinder, launched in 1997. Both have sold well, as has the Controlcenter automation product, with more than 40,000 copies installed.

However, contrary to the claims of its marketing machine, EMC did not pioneer open systems attach, or the addition of software functionality. Encore led the way in multi-platform connectability with the Infinity subsystem, introduced in early 1995. IBM introduced software including peer-to-peer remote copy on its 3990 disc controller in the late 1980s. The first product to offer Timefinder's capability was StorageTek's Snapshot, introduced in 1992.

EMC's success here stems not from pioneering but from its timing and then its selling. It read the market right on each occasion.

EMC was also early into the Nas market. Here its product is Celerra, which is effectively Symmetrix with a different front-end to permit file sharing. The company enhanced Celerra with the introduction of a software product called Highroad at the end of 2000. This allows a Nas system to operate at the speed of a San where possible.

Celerra, like Symmetrix, is a top-end product. EMC did not enjoy anything like the same success lower down the scale until it decided to acquire Data General and its Clariion disc subsystem range in 1999. EMC has continued development, with the latest Clariion being the FC4700, introduced in January 2001. However, EMC is still a long way from the market penetration it has in its top-end markets, with just 5% of the global mid-range market.

One successful outcome of its acquisition of Data General has been the development of a Clariion Nas product, known as the IP4700, which was launched in December 2000. Sales of this product, together with Celerra, elevated EMC to number one in the Nas market during 2001, as measured by both IDC, with 42% market share, and by Gartner Dataquest, with 48%.

It should be said that erstwhile market leader Network Appliance disputes these findings. "If you look at EMC's Celerra, one of the ways they have been able to report substantial sales is that they count the whole Symmetrix as revenue, whereas typically only a small portion is used for Nas," says Stuart Gilks, Network Appliance's Northern Europe SE director.

For future growth EMC is looking to its first truly innovative offering since Symmetrix. This is Centera, a type of product that EMC calls content-addressed storage, which was launched at the end of April (see "The next growth engine?" below). The company's prospects depend heavily on market acceptance of Centera, but this will take time. "This year Centera will not account for 10% of our revenue, but it will be significant," says EMC's president and chief executive Joe Tucci.

The major challenge facing the company is the need to demonstrate that its disc subsystems are truly open. "EMC has to get more into the open market," says Robin Burke, principal analyst at Gartner Dataquest. At present, software products such as SRDF are proprietary: you need a Symmetrix at each end.

EMC recognises that openness is now a market requirement. "We are now opening up our software from a management perspective to other people's storage devices," says EMC's technology analysis director Karl Steinhardt.

EMC launched an initiative called AutoIS (Automated Information Storage) last October to implement this. AutoIS is based on a new version of Controlcenter called Open Edition, which is a framework for plugging in various automation products. A key element, called Widesky, is middleware that will allow external suppliers of both hardware and software to interface with EMC's products, as well as allowing EMC software to run on non-EMC systems.

EMC continues to prosper and, it seems, will continue to figure heavily in companies' storage plans.

The next growth engine?
EMC's new Centera system is designed for the storage of fixed content - data items that never change, such as cheque images, medical scans and movies. EMC says this type of content accounts for an increasingly large share of the data stored online, and estimates that it will represent 50% of all data stored online by 2005.

The innovative feature of Centera is that the address of each data object stored is calculated from the data content of that object. This contrasts with traditional systems, where a data object's address describes only its physical location.

Using the content to calculate the address brings several potential benefits. For example, applications do not need to know anything about the storage device in use. Also, it is easy to tell whether the data stored has become corrupted, because the algorithms used to calculate the address can be rerun at retrieval time, and should produce the same answer.
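
The idea can be illustrated with a short sketch. This is not EMC's implementation: the article does not say which algorithm Centera uses, so the example assumes a SHA-256 hash as the address function, and the class and method names are hypothetical. It shows the two benefits described above: the application holds only an address, not a location, and corruption is detected by recalculating the address at retrieval time.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store; an object's address is derived from its content.

    Illustrative only. SHA-256 is an assumption, not Centera's actual algorithm.
    """

    def __init__(self):
        self._objects = {}  # maps address -> stored bytes

    @staticmethod
    def address_of(data: bytes) -> str:
        # The address depends only on the content, not on where it is kept.
        return hashlib.sha256(data).hexdigest()

    def put(self, data: bytes) -> str:
        address = self.address_of(data)
        self._objects[address] = data  # identical content maps to the same address
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Rerun the address calculation; a mismatch means the data has changed.
        if self.address_of(data) != address:
            raise ValueError("stored object does not match its address")
        return data
```

An application would call put() once, keep the returned address, and later pass that address to get(). Because identical content always yields the same address, storing the same object twice in this sketch keeps only one copy.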

