Storage Area Networks for Internet ventures

While the costs to set up a Storage Area Network (SAN) may seem daunting to many Internet ventures, the potential benefits it can provide may well justify the additional expense.

Setting up an Internet business in these uncertain times is a risky business indeed. When entrepreneurs are seeking funding for proposed ventures through a game show, it's quite obvious that things aren't at all rosy in the garden of e-commerce. While venture capital firms may now be weeding out the poorer web ideas that are presented to them - although certainly not all - it is also a lot harder for those with genuinely good business propositions to get the backing they need. Following the stock market crash of Internet firms after's IPO and the collapse of ventures such as, investors have become a great deal more wary of what they are getting themselves into.

Much the same can be said of established companies looking to introduce or expand an e-commerce side to their business. Every aspect of an Internet business is now undertaken with far more consideration than in the heady days of e-commerce only a few months back, when the only aim was to get out into the market and be recognised.

Whether you are a start-up or an established company with an Internet division, it is almost certain that the funding you are looking for will be far more difficult to come by, and when it does arrive it could well be far below the amount desired. In these situations you are going to have to make every penny count, meaning that you will need to look for a cheap and easily manageable way to set up your infrastructure.

Similarly, the rise of an Internet company can be meteoric or non-existent. If a site becomes well respected and popular, the number of requests it receives per day can increase several-fold in a very short space of time. In order to handle this, a firm's web infrastructure needs to be highly scalable, starting perhaps quite small but being able to expand quite quickly. This needs to be done while making the most of the technology that is already in the set-up. A wholesale reconstruction of the infrastructure every time the number of hits increases isn't going to help the cash flow situation, and the downtime this requires will be unpopular with customers.

One area whose importance to an e-commerce venture is often overlooked is the back-up and storage requirements of the systems involved. While storage is possibly not the largest cost component of a website infrastructure, the money involved can become very significant indeed as the site starts to grow, and management of all the storage drives required becomes increasingly vital as the size of the site and its data continue to grow.

Traditionally, a site will handle demand through load balancing techniques. While different users may enter the site through the same gateway, they may often be diverted via a layer 4 switch to different servers in order to avoid congestion. This way, it is easy to scale the site in terms of computing power. If the site needs more CPU cycles, the administrator can simply purchase the necessary server hardware and deploy it into what is increasingly becoming a load balancing server farm.
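The way a load balancer spreads users across a growing server farm can be sketched in a few lines of Python. This is only an illustration: the round-robin policy, the class name RoundRobinBalancer and the server names are assumptions made for the example, and a real layer 4 switch may use a different algorithm, such as least connections.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming connection to the next server in the farm."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._rotation = cycle(tuple(self.servers))

    def add_server(self, name):
        # Scaling out: the new server joins the farm and the rotation restarts
        self.servers.append(name)
        self._rotation = cycle(tuple(self.servers))

    def route(self):
        # Pick the server that should handle the next connection
        return next(self._rotation)


farm = RoundRobinBalancer(["web1", "web2"])
first_four = [farm.route() for _ in range(4)]

farm.add_server("web3")  # more CPU cycles needed, so deploy another server
after_growth = [farm.route() for _ in range(3)]
```

Adding computing power here is a one-line change, which is exactly why this model scales so easily on the server side.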

In this model it is also common to be using directly attached storage. This is where each web server has its own storage device hooked up. The storage device, whether it be a tape drive, disk system or other, is linked only to one web server and nothing else. Thus, every time a new web server is added in this configuration, a storage device will also need to be added. This set-up poses a number of storage issues.

If there is any change to the web data - whether more is added, pages are updated or data is removed - each and every storage device will need the changes implemented on it. This also has to be done in a manner that ensures that all the web servers are running off exactly the same data all of the time - not a simple task. There is also the issue that as the website changes and expands, more storage space may be needed. Where storage is directly attached in a load balancing web server farm and the capacity of those servers is reached, each server will require additional storage deployed to it. This is both costly, due to additional hardware having to be bought for each server, and clumsy, with each server having to be taken down to integrate the extra storage devices. Add to this the cost of adding extra storage every time a new server needs to be installed to improve CPU processing power, and what seemed to be an easy way to manage the website has become complex and costly.

When looking at alternative topologies to directly attached storage in a website infrastructure, SANs may not initially seem like the most cost-effective way to use the venture capital provided, and certainly the technology is still relatively new and therefore not the cheapest. Setting up a Fibre Channel network dedicated to the company's storage needs, however, may be the ideal topology for companies, such as new web firms, that may need to expand quickly while fully utilising existing technology.

In a SAN, instead of each web server having its own directly attached storage device, each web server is linked to a Fibre Channel switch or hub, which in turn links to a cluster of storage devices. These storage devices can be JBOD (Just a Bunch of Disks) drives in an arbitrated loop formation, Fibre Channel disk arrays or tape drives attached by a SCSI-FC bridge. While the initial investment in Fibre Channel storage technology may cost more than directly attached storage for smallish sites, the benefits of the SAN are many.
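The difference between the two layouts can be made concrete by modelling each one as a set of links and asking which storage devices a given web server can reach. The sketch below is a toy model only: the device names (web1, switch, jbod1 and so on) and the helper function reachable are hypothetical, chosen purely for illustration.

```python
from collections import deque


def reachable(links, start):
    """Breadth-first search over undirected links; returns every node reachable from start."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# SAN: every server hangs off the switch, as does every storage device
san_storage = {"jbod1", "fc_array1", "tape1"}
san_links = [
    ("web1", "switch"), ("web2", "switch"),
    ("switch", "jbod1"), ("switch", "fc_array1"),
    ("switch", "bridge"), ("bridge", "tape1"),  # SCSI tape via a SCSI-FC bridge
]

# Directly attached storage: one private device per server
das_storage = {"disk1", "disk2"}
das_links = [("web1", "disk1"), ("web2", "disk2")]

san_view = reachable(san_links, "web1") & san_storage
das_view = reachable(das_links, "web1") & das_storage
```

In the SAN model web1 can see all three storage devices, while in the directly attached model it can see only its own disk.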

The advantage of Fibre Channel itself is that it tries to bring the best aspects of networking, such as large address space and scalability, together with the high speed, low latency and hardware error detection of I/O channels. It can also allow multiple protocols, like IP and SCSI, to be used over one infrastructure. The biggest advantages of all, however, are shown in the layout of a SAN.

Since all servers can, if desired, access all of the storage devices via the SAN, it is only necessary to store the data on one storage device. This is, of course, without taking into account fault tolerance, but even if a second copy of the data is stored elsewhere, it is still a great deal more economical than storing numerous images of the same data on back-up devices attached to every server. Alongside eliminating unnecessary duplicate images, a SAN also brings economies of scale by being able to store large amounts of data on just one device that can be accessed by everything, rather than on the many smaller devices in a directly attached storage topology.
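A back-of-the-envelope calculation shows the scale of the saving. The figures below (eight web servers, 50GB of site data, two copies kept on the SAN for fault tolerance) are invented purely for illustration, and the function names are hypothetical.

```python
def das_capacity_gb(servers, data_gb):
    # Directly attached: every server holds its own full image of the data
    return servers * data_gb


def san_capacity_gb(data_gb, copies=1):
    # SAN: one shared image, plus any extra copies kept for fault tolerance
    return data_gb * copies


servers = 8   # assumed size of the server farm
data_gb = 50  # assumed size of the site data

das_total = das_capacity_gb(servers, data_gb)
san_total = san_capacity_gb(data_gb, copies=2)
print(f"Directly attached: {das_total}GB, SAN with a mirror: {san_total}GB")
```

Under these assumptions the directly attached farm needs 400GB of raw capacity against 100GB for the mirrored SAN, and the gap widens with every web server added.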

A SAN also makes the storage element completely independent from the server part of the network, allowing servers to be upgraded or added while leaving the storage untouched. Similarly, storage can be added to the SAN without interfering with the operation of the web servers. This leads to minimum downtime and service disruption, a vital area for Internet companies in this era of continuous availability.

As discussed earlier, scaling up a system with directly attached storage is both expensive and disruptive to the running system. A SAN doesn't suffer from this problem. Storage can simply be added to a switch without fuss. When the switch is full or the topology requires a more complex structure, adding another switch is reasonably straightforward and also has the advantage of increasing the switching capacity of the SAN without degrading performance (something which happens on, say, a loop topology). Because of this scalability, features such as fault tolerance and hot backup can also be added with relative ease and so only need to be thought about when there is sufficient financial incentive to do so.
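When a second switch becomes necessary is a matter of simple port arithmetic. The sketch below assumes one switch port per attached server or storage device and ignores the ports used up by links between switches, so it gives only a lower bound. The eight-port switch size and the function name switches_needed are assumptions for the example.

```python
import math


def switches_needed(servers, storage_devices, ports_per_switch):
    """Lower bound on switch count: one port per attached device.

    Ports consumed by inter-switch links are ignored (a simplifying assumption).
    """
    devices = servers + storage_devices
    return max(1, math.ceil(devices / ports_per_switch))


# A small starting configuration fits comfortably on one switch
start = switches_needed(servers=2, storage_devices=1, ports_per_switch=8)

# After the site has grown, a second switch is required
grown = switches_needed(servers=10, storage_devices=5, ports_per_switch=8)
```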

It is this type of scalability and adaptability that lends itself so well to Internet ventures, as shown in the last few months. Things can change so quickly in this fast-paced business - you don't want to get caught out by a sudden surge in demand requiring a complete overhaul of your systems just a few months after the last one. Yes, SAN is new and still developing. Yes, it is rather expensive to implement initially, but that initial step into the world of SAN needn't be a large one. In essence, all you need to configure a SAN are Host Bus Adapters for connecting each server to the SAN, Fibre Channel storage such as JBOD or Fibre Channel RAID, a SCSI-FC bridge to allow SCSI storage devices to connect to the SAN and networking components such as a switch. This could start very small with perhaps just a couple of host devices connected to one SAN storage device. From there, how the SAN grows really depends on how the entire enterprise grows.
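The shopping list above can be turned into a simple parts count. This is a minimal sketch under stated assumptions: one Host Bus Adapter per server, a single SCSI-FC bridge shared by all SCSI devices, and one switch to start with. The function name starter_san_parts and the dictionary keys are hypothetical.

```python
def starter_san_parts(servers, fc_storage, scsi_storage):
    """Count the components listed in the article for a minimal SAN."""
    return {
        "host_bus_adapters": servers,                 # assumed: one HBA per server
        "fc_storage_devices": fc_storage,             # JBOD or Fibre Channel RAID
        "scsi_fc_bridges": 1 if scsi_storage else 0,  # assumed: one bridge serves all SCSI devices
        "switches": 1,                                # assumed: a single switch to begin with
    }


# The very small starting point: two hosts sharing one SAN storage device
parts = starter_san_parts(servers=2, fc_storage=1, scsi_storage=0)
```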

Of course, a SAN may not be right for every Internet company, and there may be many who just cannot justify the cost that setting one up will incur. For others, however, who may not have even thought about the possibility before, a thorough cost-benefit analysis of implementing a SAN may reveal a few pleasant surprises.

Paul Grant
