The Large Hadron Collider at CERN (the European Organization for Nuclear Research) in Geneva was restarted on 21 November, following an electrical fault that shut the particle collider down in September 2008. It is designed to create hundreds of millions of collisions between subatomic particles every second in an attempt to advance knowledge of energy and matter and, most significantly, to piece together the events immediately after the big bang.
Four experiments spaced along an underground tunnel the size of London's Circle line will produce 15 petabytes of data per year. This creates one of the greatest challenges in the history of computing: data must be collected, formatted, stored, shared and secured on a massive scale.
On site is a staggering concentration of heavy-duty computing power, which includes 5,700 systems, 36,600 processing cores, 41,500 disk drives, 45,000 tape cartridges and 160 tape drives. But this only begins to touch on the effort needed to support the project.
When the Large Hadron Collider (LHC) is running at full strength, experimental data must be accessible to more than 10,000 scientists at several hundred research institutes and universities worldwide taking part in the LHC experiments. All the data needs to remain available over the LHC's estimated 15-year lifetime. Analysis of the data, including comparison with theoretical simulations, requires about 100,000 CPUs at current measures of processing power.
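To put those figures in perspective, the short Python sketch below works through the arithmetic. It is a back-of-the-envelope illustration only, not CERN's own capacity planning: it assumes a decimal petabyte (10^15 bytes), a 365-day year, a flat 15 petabytes in every year of operation, and that one on-site processing core counts as roughly one "CPU".

```python
# Rough arithmetic on the figures quoted in the article.
# Assumptions (not from CERN): decimal petabytes, a 365-day year,
# a constant 15 PB every year, and one core treated as one "CPU".

PETABYTE = 10**15                      # bytes
SECONDS_PER_YEAR = 365 * 24 * 3600

annual_volume_pb = 15                  # petabytes produced per year
lifetime_years = 15                    # estimated lifetime of the LHC
cpus_needed = 100_000                  # CPUs needed for analysis
onsite_cores = 36_600                  # processing cores on site at CERN

# Total data that must stay available over the collider's lifetime.
lifetime_total_pb = annual_volume_pb * lifetime_years

# Average sustained rate if the yearly volume were spread evenly.
avg_rate_mb_s = annual_volume_pb * PETABYTE / SECONDS_PER_YEAR / 10**6

# Share of the analysis load the on-site cores could cover alone.
onsite_fraction = onsite_cores / cpus_needed

print(f"Lifetime data volume: {lifetime_total_pb} PB")
print(f"Average data rate:    {avg_rate_mb_s:.0f} MB/s")
print(f"On-site share of CPU: {onsite_fraction:.1%}")
```

Under these assumptions the archive grows to 225 petabytes over the collider's lifetime, the average rate is a little under 500 megabytes per second, and the on-site cores cover only a little over a third of the processing requirement, which is why the rest has to come from computing centres elsewhere.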
CERN's IT department head, Frédéric Hemmer, is helping to drive an ambitious project to develop huge grid computing networks that would support the LHC and many other important research initiatives throughout Europe and the rest of the world.
Hemmer's former colleague Tim Berners-Lee, of course, invented the web. Now Hemmer's and CERN's work on grids is itself producing powerful innovations that are driving major changes in computing and communications, most notably the move towards the cloud.
The Worldwide LHC Computing Grid (WLCG) is a collaboration between more than 170 computing centres in 34 countries. It uses both the Open Science Grid (OSG) in the US and Enabling Grids for E-sciencE (EGEE), the largest grid in Europe.
As well as particle physics, these grids support research into fields as diverse as climate change, medicine and computer graphics. Hemmer and a number of his peers are presently pushing for a more formalised grid organisation that would link most of the universities and research institutes throughout Europe.
This is one of the goals of the European Grid Initiative (EGI), which, in addition to fostering technical collaborations, is trying to help member countries resolve legal differences and agree on frameworks and standards for privacy and security.
"We are now trying to have something more permanent that would rely more on national efforts," Hemmer explains. Key to this has been the creation of independent National Grid Initiatives (NGIs), which represent the research groups and universities of each member country. NGIs are entities with a public mission that aim to integrate funding resources at national level for the provision of grid-based services. They are designed to be a one-stop shop for a number of common grid-based services for national research communities.
As of 2007 the EGI had 36 supporting countries. By the time the organisation officially comes into effect next year, it is hoped that it will play a central role in ensuring a sustainable future for the e-infrastructure developed under the series of EGEE projects.
The massive task of data management and resource allocation for CERN and the many other organisations linked to it through various grid architectures has demanded some innovative solutions on the middleware front. In collaboration with EGEE, CERN and other groups have worked to develop what is seen as the next generation of middleware for grid computing. Born from the collaborative efforts of more than 80 people in 12 different academic and industrial research centres as part of the EGEE project, gLite provides an open-source framework for building grid applications that tap into the power of distributed computing and storage resources across the Internet.
gLite effectively offers researchers two tiers of service. Grid foundation middleware covers the security infrastructure, information, monitoring and accounting systems, and access to computing and storage resources, with the aim of delivering a consistent and reliable production infrastructure. Higher-level grid middleware covers services such as job management, data catalogues and data replication.
It is expected that open-source middleware solutions of this kind, emerging from the academic realm, will increasingly inform the development of corporate applications running on Linux and other platforms.
Naturally, security is a major concern with grids, especially where, as is often the case, breaches of sensitive research data could be very embarrassing or damaging. In the cloud environment, many argue that service providers are at pains to deliver better security, lest their own commercial reputations suffer. In the grid environment, on the other hand, the drivers are somewhat different, as is the approach to managing and distributing information.
"When we look at reliability and security of grids, the problem is not the grid itself; the difficulty is in securing the sites," says Hemmer. To address these concerns, the EGEE is conducting a wide-ranging security trial across several of its members' sites in an effort to tighten the system. "The efforts we are putting in will ensure that the sites are secure and able to react to security incidents."
Strong security is of course essential for there to be the sort of widespread collaboration envisaged by the EGI.
In the cloud environment, on the other hand, it is not the sharing of information that causes concern but the fact that businesses usually don't know where their data is being kept.
A key difference between cloud and grid computing is that grids are designed to foster collaboration between users while users in the cloud are invisible to each other.
The way in which resources are allocated is also different. Cloud services are structured much like utilities, with users simply paying for what they use. Users of grids, on the other hand, typically make sporadic yet very large requests for resources.
However, analysts note that the lines between the two are beginning to blur as researchers look to bring more flexibility to grid infrastructures, while businesses in the cloud are beginning to discuss opportunities for collaboration. "Now cloud computing has come into focus it's starting to blur the difference between grid and cloud computing," explains Vuk Trifkovic, senior analyst with Datamonitor. "Some of the cloud platforms in the future will be built in a similar way to grid computing, like consortia. In time, with cloud computing we'll see a much higher degree of collaboration."
He adds: "As the technology stabilises through improved international standards and reliability, grids are gaining acceptance in mainstream business and science communities." CERN's Hemmer observes a trend towards industry-specific cloud platforms that operate in much the same way as grids, except that users still essentially rent the infrastructure and applications from a service provider. The benefit is that capacity and security can be more easily guaranteed.
However, it is expected that the pricing and business models currently in use in the cloud will look very different in years to come, as more industry- and application-specific services emerge and companies demand more flexibility.
Conversely, various grid architectures around the world have begun to incorporate certain features of cloud computing. In fact, CERN and others are investigating the feasibility of layering cloud services from Amazon and others over the grid.
Hemmer explains, however, that there are considerable sensitivities associated with doing this, particularly with regard to data being held in dispersed or unknown locations. "This is something that some funding agencies for science do not like; they prefer to install in their own country."
Nevertheless, as CERN and thousands of researchers around the world prepare for the relaunch of the massive LHC project, it is somewhat ironic that the vast grid networks on which so much of its success depends may at times find themselves leaning on the very cloud services they essentially gave birth to.