IT departments have a delicate balancing act to perform. They face pressure to rationalise, standardise and consolidate applications and systems to reap the benefits of centralisation strategies. At the same time, however, IT infrastructures are becoming more decentralised, and pressure is always mounting to use physical IT resources as efficiently as possible.
Distributed systems offer a route to centrally controlled computing power with geographically dispersed hardware. Some systems, such as on-demand smart software clients, have harnessed the power of the web to distribute applications in a way that reduces the need for on-site maintenance by IT staff.
Other approaches to distributed resources, such as grid computing, have focused on the infrastructure side of the problem. Approaches of this type aim to tackle the growing cost of powering the vast IT environments of today's enterprises by making use of computing power that is available elsewhere on the network.
From a technology perspective, distributed computing is defined by the ability to run different parts of a computer program at the same time on two or more networked computers. But what have made it an attractive solution for high-performance computing needs are the parallel computing capabilities it offers across different file systems and hardware components.
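The idea of running different parts of one program at the same time can be sketched in a few lines of Python. This is only an illustration: the function names (`count_primes`, `split_range`, `distributed_count`) and the prime-counting workload are invented for the example, and a local thread pool stands in for the networked machines of a real grid, so it shows the split-and-combine structure rather than any actual speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(bounds):
    """Count the primes in the half-open range [start, stop)."""
    start, stop = bounds
    count = 0
    for n in range(max(start, 2), stop):
        # n is prime if no divisor between 2 and sqrt(n) divides it.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(start, stop, parts):
    """Split [start, stop) into at most `parts` contiguous chunks."""
    step = max(1, (stop - start + parts - 1) // parts)
    return [(lo, min(lo + step, stop)) for lo in range(start, stop, step)]


def distributed_count(start, stop, workers=4):
    """Hand each chunk to a separate worker and add up the partial results.

    Assumption: the local pool is a stand-in for remote nodes. In a real
    grid, each chunk would be shipped to a different machine.
    """
    chunks = split_range(start, stop, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_primes, chunks))
    return sum(partials)
```

The chunks are independent of one another, which is what makes this kind of job easy to spread across machines: no worker needs to wait for another's result until the final sum.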
Increased computing power, better computational techniques and widespread adoption of messaging standards, such as the use of Internet Protocol (IP) addresses, have facilitated many different types of distributed computing.
There are challenges in managing and controlling distributed systems, including clusters, grids and even distributed storage systems. Leslie Lamport, a researcher on timing, message ordering and clock synchronisation in distributed systems, said, "A distributed system is one on which I cannot get any work done because some machine I have never heard of has crashed."
Distributed computing environments rely on the natural evolution of faster, more resilient networking technology to allow efficient sharing of processing demand across the infrastructure. Grid computing is the most commonly used form of distributed computing.
The essential difference between a grid computer and a conventional supercomputer is that a grid has to transfer information across a network. It is this constraint that should influence decisions about which applications are suitable for the grid.
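A back-of-envelope calculation shows why network transfer matters when judging whether an application suits a grid. The model below is a deliberately crude sketch, not a sizing method: it assumes the work divides evenly across nodes, that all input data crosses a single shared link once, and it ignores latency and scheduling overhead. The function names and every figure used with them are hypothetical.

```python
def grid_runtime(compute_seconds, nodes, data_bytes, bandwidth_bytes_per_s):
    """Estimate grid run time as divided compute time plus transfer time.

    Assumptions (illustrative only): perfect division of the compute work,
    one pass of the data over one link, no latency or scheduling cost.
    """
    transfer_seconds = data_bytes / bandwidth_bytes_per_s
    return compute_seconds / nodes + transfer_seconds


def worth_distributing(compute_seconds, nodes, data_bytes, bandwidth_bytes_per_s):
    """Return True if the estimated grid run beats a single machine."""
    estimate = grid_runtime(compute_seconds, nodes, data_bytes, bandwidth_bytes_per_s)
    return estimate < compute_seconds
```

Under these assumptions, an hour-long simulation that needs 1GB of input gains a great deal from 100 nodes on a 100MB/s link, because the transfer takes seconds. A one-minute job that has to move 1TB over the same link does not, because the transfer alone takes hours. Compute-heavy, data-light workloads are therefore the natural candidates.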
Gartner analysts Mike Chuba and Carl Claunch have described the infrastructure of a distributed system as "multiple resource ownership and a single purpose". They observed that widespread adoption of grid computing had been held back by the complexity of designing and managing grids.
"It is not just technology limitations or the issue of complexity - you cannot just go out one day and buy a grid - it is also the political or organisational issues that arise when multiple owners with possibly conflicting priorities are involved," Gartner said.
In a research note, Chuba and Claunch outlined the value of grid adoption: "The motivations for using a grid to create a more powerful, larger, single virtual system, or to produce a less expensive alternative of the same size as the system it is replacing, are powerful factors that compel many organisations to look at possible grid systems."
The organisations that could benefit are not limited to the traditional high-performance computing enclaves of governmental and commercial projects that focus on large-scale research and logistics.
Users of high-performance computing where a grid could be used include a manufacturer wanting to carry out complex design simulation and analysis, or an insurance company wanting to model financial risk. "The ability to create a virtual supercomputer that is faster than the fastest traditional design opens the door for sizeable long-term rewards," said Chuba and Claunch.
When eBay needed computing power to support the 222 million registered users who add six million new items every day, it chose a grid system. Paul Strong, research scientist at eBay and Open Grid Forum vice-chairman, said the company has been using grid technology for some years.
He explained that the concept of distributed computing within the organisation had matured through the use of grid computing, as other technology and business requirements contributed to it being a key pillar of the company's IT strategy.
Strong sees grid computing as an intellectual approach as much as a specific set of technologies. "Grid is often used in a generalised context to refer to distributed computing. But essentially, I see it as exploiting the resources of the network to solve problems, with access to more computing resources more quickly," he said.
"In one way, you could consider all IT infrastructures to be grids. eBay runs a range of applications, from the search and transactional platforms to more back-office functions," said Strong. The systems that power eBay include a 15,000-strong server estate, half of which is located in the US and half in Europe. The system has 600 database instances in production, with a number of these managed in clusters of 100 or more.
The ability to scale an existing system is another important consideration in grid computing. At eBay new code can be run over a 100-node test version of eBay.com and then installed on the live grid in less than 30 minutes, said Strong.
Both Lamport and Gartner stress the complexity involved in managing grids the size of eBay's, and Strong admits that the mainstream market still has much to do in building products that are robust enough to support large-scale distributed computing strategies. "We buy products, and we break them," he said.
"We ended up building a whole load of management applications, and we have 20 to 30 other management applications that are homegrown. But that is not to say we would rather replace and integrate them with off-the-shelf products.
"Building enterprise-management frameworks and tools is not our core business, but over the past couple of years we have realised there are no tools out there with enough functionality."
Strong's call for better grid systems management tools reflects a general need for applications that work well with distributed systems and support business continuity.
"There are obvious benefits in having a shared, heterogeneous grid, so there is a broader context in which all distributed computing happens, and that is the notion that the very network or fabric becomes the server," Strong said.
"Applications like service oriented architecture and web services can be said to form a platform that acts as an integrated datacentre in itself. I think the term infrastructure information system is better. But the issue is a need to shift away from using the network for greater resilience to how you are actually going to manage the thing."
Recent changes in network management software may offer eBay some hope. Oracle's current 10g database products have supported the industry move from Risc and Unix-based servers to low-cost, high-density ones based on open standards, for example.
Chuck Rozwat, Oracle server technologies executive vice-president, said, "Grid helps bring together resources at the application, database and storage levels and share those resources across workload requirements. This enables better predictability [in capacity planning] and less cost because you need fewer software licences."
Oracle 10g uses a virtualisation approach to split data storage from the database transaction and process layer. And the clustering and virtualisation technologies being developed look set to support this.
Scott Reynolds, technology consultant for systems integrator Morse, said that thin clients work on the same distributed premise, and that this is proving to be an attractive option for organisations. "The ease of support, maintenance and consolidation can give IT managers better control of their IT environment. This makes upgrades easier, as opposed to having to physically do each upgrade on the actual PCs," he said.
He pointed to recent activity by Citrix, which at the beginning of this year announced that it had completed the acquisition of on-demand software platform developer Ardence. The move was a bid by Citrix to improve its ability to let IT administrators provision PCs, servers and web services on demand from a centrally managed source.
Citrix says Ardence software can support a dynamic desktop infrastructure, in which both the operating system and applications are delivered to a bare-bones machine from virtual discs on a centralised server.
The use of J2EE Java web services has also encouraged the uptake of more distributed computing models at the application level. "Some would say grid is only there at the hardware level," Reynolds said. "At the same time, it may be valid to say virtualisation is acting in a similar way at the software level now, but it is still very early days. Who knows, the Citrix and VMware markets may end up merging if the demand is there."
Most analysts and experts agree that distributed computing is likely to become more common as pressure to optimise data handling increases. Reynolds said, "There is a lot of work going on with suppliers around SOA in the utilisation of memory and processor power. SOA allows you to consider the need for extra resources.
"But the business model for delivering IT has to change with it. Reducing complexity, introducing products to automate both the software and hardware, and using dashboards and tools to manage the rules you apply will be the best course for anyone with a distributed IT environment."
Looking ahead, the convergence of virtualisation, web-based application delivery and web services-based technologies looks set to continue in parallel with the development of ever faster and more powerful computers and networks. Whatever IT strategy you implement, the chances are it will involve a distributed computing model.