
This month (August) the world's biggest particle
accelerator, the Large Hadron
Collider (LHC), will begin hurling subatomic
particles called
protons around a
27km circular tunnel running beneath the Swiss-French border,
before crashing them into each other. By doing so, particle
physicists hope to learn more about the physical universe. At the
same time, they are reinventing the way they share their research
with each other.
An international initiative that will involve more than 2,000
physicists from 150 research institutions in more than 30
countries, the LHC is being managed by the Geneva-based particle
physics laboratory
CERN,
the European Organization for Nuclear Research. Founded in 1954,
CERN has a distinguished scientific pedigree - it has been home to
three Nobel laureates, and is where in 1990 computer scientist
Tim Berners-Lee invented the
World Wide Web.
The objective of particle physics - also known as high-energy
physics or
HEP
- is to study the tiniest objects in nature to answer two
fundamental questions: What is the world made of and what holds it
together? To answer these questions, it is necessary to recreate
the conditions that prevailed at the time of the
Big Bang -
which is the aim of particle accelerators.
Physicist
Rolf-Dieter Heuer,
who becomes CERN director general in January, explains,
"We want to unravel the secrets of the microcosm and of the early
universe. The LHC has the highest energy ever obtained in a
collider, and so will bring us closer to the Big Bang and to the
early universe."
A key objective will be to test the so-called
Standard
Model of particle physics - the best theory currently available
to explain the fundamental interactions between the 12 elementary
particles that make up all matter and the four fundamental forces
that cause these particles to interact.
At the moment an important cornerstone of the Standard Model is
missing: it does not explain how matter and force particles get
their mass. "The Standard Model only works for massless particles,
and we know that [with a few exceptions] the fundamental particles
of the universe are not massless," says Heuer.
Higgs mechanism
To explain this missing component, physicists have postulated
the so-called
Higgs
mechanism. "We know from theory, and we know from our precision
tests, that the answer to the question of how particles gain mass
must lie within the energy reach of the LHC," says Heuer. "If we
don't find it, the Higgs mechanism doesn't exist, and theorists
will have to find another theory to explain how particles acquire
mass."
Physicists also hope to gain some insight into
"dark matter"
- matter that is invisible but whose presence can be inferred from
its gravitational effects on visible matter. Cosmologists estimate
that less than 5% of the energy and matter content of the
universe is visible, with more than 95% consisting of
dark matter and
dark energy.
"Our hope is to get a first glimpse of dark matter," says
Heuer.
It has taken six and a half years to build the LHC, at a cost of
£4.75bn. But creating the world's most expensive collider is just
the first challenge posed by the LHC. Another is how to collect and
analyse the data.
The LHC will accelerate two beams of particles in opposite
directions at close to the speed of light. When they reach a sufficiently
high energy level, they will be crashed into each other in a
collision that will rend the original protons, sending off a spray
of other particles. Monitoring these collisions will be a series of
enormous experimental instruments called detectors, including one
as high as a five-storey building, called
ATLAS, and another the size of 40 large aeroplanes, called
CMS.
As particles pass through these detectors, they will be counted,
traced and analysed using extremely sensitive equipment. The
trackers of both ATLAS and CMS, for instance, contain silicon
wafers. As charged particles pass through these wafers, they will
give rise to electrical signals that will betray their passage.
Outside the trackers are
calorimeters,
which slow down and absorb the particles, measuring their energy.
The time it takes for a particle to pass through will also be
minutely calculated.
The LHC will need to be both the coldest and the hottest place
in the universe. To achieve the
superconductivity
needed to create the very strong magnetic fields,
for instance, the temperature will have to fall to -271.25°C. Then,
when two beams of protons collide, it will soar to 100,000 times
the temperature of the sun.
Volumes of data
The next challenge will be managing the huge volumes of data
generated. "There will be 40 million collisions a second, but we
can only afford to write a tiny fraction of them to tape," says
Salvatore Mele, a project leader at CERN.
Because many of the events observed in the detectors will be
unremarkable, the secret lies in homing in on the unusual, and
recording only the 200 most interesting events every second. Even
so, about 15 petabytes of data will be generated annually. If
stored on CDs, this would create a 20km-high tower of discs.
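The arithmetic behind these figures can be checked with a short sketch. The collision rate, recording rate and annual data volume come from the figures quoted above; the CD capacity (700MB) and thickness (1.2mm) are assumptions, not taken from the article, so the tower height is only a rough estimate.

```python
# Figures quoted in the article
COLLISIONS_PER_SECOND = 40_000_000
EVENTS_RECORDED_PER_SECOND = 200
DATA_PER_YEAR_BYTES = 15 * 10**15  # 15 petabytes (decimal)

# Assumptions, not from the article: a standard 700MB CD, 1.2mm thick
CD_CAPACITY_BYTES = 700 * 10**6
CD_THICKNESS_MM = 1.2


def collisions_per_recorded_event():
    """How many collisions occur for each event written to tape."""
    return COLLISIONS_PER_SECOND // EVENTS_RECORDED_PER_SECOND


def cd_tower_height_km():
    """Height of a stack of CDs holding one year of data, in kilometres."""
    discs = DATA_PER_YEAR_BYTES / CD_CAPACITY_BYTES
    return discs * CD_THICKNESS_MM / 1_000_000  # millimetres to kilometres


if __name__ == "__main__":
    print(f"One event recorded per {collisions_per_recorded_event():,} collisions")
    print(f"CD tower: roughly {cd_tower_height_km():.0f} km")
```

Under these assumptions the stack comes out in the low tens of kilometres, the same order of magnitude as the 20km tower described; a different assumed disc capacity or thickness would shift the result.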
Once collected, the data will be processed and used to perform
complex theoretical simulations, a task requiring massive computing
capacity. The problem, says Heuer, is that "no science centre, no
research institution, and no particle physics lab in the world has
enough computer power to do all the work".
CERN will distribute the data to a network of computing centres
around the world using a dedicated
computing grid. This will allow the workload to be shared, and
ensure there are multiple copies of the data stored in case of
failure.
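The idea of spreading data across centres while keeping multiple copies can be illustrated with a minimal sketch. This is not how the LHC grid actually places data; the round-robin scheme, the function names and the site names below are all hypothetical, chosen only to show why replication protects against the failure of a single centre.

```python
def assign_replicas(blocks, sites, copies=2):
    """Assign each data block to `copies` distinct sites, round-robin."""
    if copies > len(sites):
        raise ValueError("need at least as many sites as copies")
    placement = {}
    for i, block in enumerate(blocks):
        # Consecutive sites (wrapping around) are distinct because copies <= len(sites)
        placement[block] = [sites[(i + k) % len(sites)] for k in range(copies)]
    return placement


def blocks_still_available(placement, failed_site):
    """Blocks that keep at least one copy if `failed_site` goes offline."""
    return [
        block
        for block, holders in placement.items()
        if any(site != failed_site for site in holders)
    ]


if __name__ == "__main__":
    # Hypothetical block and site names, for illustration only
    placement = assign_replicas(["b0", "b1", "b2"], ["site-A", "site-B", "site-C"])
    for block, holders in placement.items():
        print(block, "->", holders)
    print("Available without site-A:", blocks_still_available(placement, "site-A"))
```

With two copies of every block, the loss of any one site leaves every block readable from somewhere else, and the work of processing can be shared among the sites that hold each block.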
But the biggest challenge will be how to store the data in a
format that allows reuse. Historically, when a HEP experiment
ended, the data was abandoned. But because it is costing £4.75bn to
collect the LHC data, it would be profligate not to allow reuse.
"Ten or 20 years ago we might have been able to repeat an
experiment," says Heuer. "They were simpler, cheaper and on a
smaller scale. Today that is not the case. So if we need to
re-evaluate the data we collect to test a new theory, or adjust it
to a new development, we are going to have to be able to reuse it.
That means we are going to need to save it as open data."
Formalising knowledge
The problem, says Mele, is that HEP data is generally written in
"an experiment-proprietary non-standard format" that only those
working on the experiment understand. Also, the knowledge needed to
interpret the data resides only in scientists' heads, and is forgotten once an experiment is
finished. So the answer lies in formalising that knowledge and
embedding it in the saved data. But for the moment, no one knows
how to do this. "It remains a challenge for the techies," says
Heuer.
Openness is not an issue for data alone, however. The research
papers produced from the LHC experiments will also have to be open
- which presents a different kind of challenge.
Today, when scientists publish their papers, they assign
copyright to the publisher. Publishers arrange for the papers to be
peer-reviewed, and then sell the final version back to the research
community in the form of journal subscriptions.
But because of an explosion in research during recent decades,
along with rampant journal price inflation, few research
institutions can now afford all the journals they need. "Journal
prices are rising very strongly," says Heuer. "So the reality today
is that lots of researchers can no longer afford access to the
papers they need."
This problem is not unique to particle physics - it affects the
entire research community, and has given rise to the
Open Access (OA) movement, which
calls for
all peer-reviewed scientific literature to be made freely available
online.
Peer review
As the LHC countdown began, the HEP community launched a number
of OA initiatives. In 2006, for instance, CERN
spearheaded a new project called
SCOAP3, which aims to pay
publishers to organise peer review on an outsourced basis, thus
allowing published research to be made freely available.
Funding bodies and research institutions are being asked to
redirect the money they currently spend on journal subscriptions to
a common fund managed by SCOAP3. Publishers will then be invited to
tender for the peer review services they already provide, but
without acquiring ownership of the research. The services will be
paid centrally by the SCOAP3 consortium, and the papers placed on
the open Web.
Essentially, it is a radical plan to "flip" the entire HEP
journal literature from a subscription-based model - in which a
paywall is erected between scientist and research - to an Open
Access model. "The aim is to make all HEP journal articles free to
read and reuse as we want, and at the same time alleviate the
serials crisis," says Annette Holtkamp, an information professional
at DESY, Germany's
largest HEP research centre, and a member of the SCOAP3 working
party.
A second initiative will see the creation of a free online HEP
database called
INSPIRE.
This will be pre-filled with nearly 2 million bibliographic records
and full-text
"preprints"
harvested from existing HEP databases such as
arXiv,
SPIRES and the
CERN Document Server
(CDS).
If SCOAP3 proves successful, the final full-text version of
every HEP paper published will be deposited in INSPIRE, making it a
central resource containing the entire corpus of particle physics
research.
Database model
This suggests scholarly publishing is set to migrate from a
journal-based to a database model, and one likely consequence will
be the development of
"overlay journals". Instead of submitting their papers to
publishers, researchers will deposit their preprints into online
repositories such as INSPIRE. Publishers will then select papers,
subject them to peer review (for which they will levy a service
charge), and "publish" them as Web-based journals - although, in
reality, the journals will be little more than a series of links to
repository-based papers.
"INSPIRE would be an ideal test-bed to experiment with overlay
journals, because it will contain the entire corpus of the
discipline," says Holtkamp.
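The notion of a journal as "little more than a series of links" can be made concrete with a small sketch. Everything here is hypothetical: the class names, the record identifier and the repository URL are placeholders for illustration, and do not describe how INSPIRE or any publisher actually works.

```python
from dataclasses import dataclass, field


@dataclass
class Preprint:
    record_id: str  # hypothetical repository identifier
    title: str
    peer_reviewed: bool = False


@dataclass
class OverlayJournal:
    name: str
    repository_url: str  # placeholder base URL of the repository
    accepted_ids: list = field(default_factory=list)

    def accept(self, preprint):
        """Record that a preprint passed peer review; the paper stays in the repository."""
        preprint.peer_reviewed = True
        self.accepted_ids.append(preprint.record_id)

    def table_of_contents(self):
        """The 'journal' itself: a list of links to repository-based papers."""
        return [f"{self.repository_url}/{rid}" for rid in self.accepted_ids]


if __name__ == "__main__":
    journal = OverlayJournal(
        "Hypothetical Overlay Journal", "https://repository.example.org/record"
    )
    journal.accept(Preprint("0001", "A sample preprint"))
    print(journal.table_of_contents())
```

The point of the design is that the journal stores no papers of its own: it adds only a peer-review decision and a link, while the full text remains in the open repository.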
At the same time, more and more research will take place, and be
"published", on blogs, wikis and in
open lab
notebooks. "The classical journal article will not remain the
main vehicle for scholarly communication," says Holtkamp. "In the
future we can expect to see different materials and media used at
different stages of the research process."
What is key to current developments is the belief that
scientific information must be openly available. Because science is
a cumulative process, the greater the number of people who can
access research, critique it, check it against the underlying data
and then build on it, the sooner new solutions and theories will
emerge. And as
"Big Science"
projects like the LHC become the norm, the need for openness will
be even greater because the larger the project, the more complex
the task, and the greater the need for collaboration - a concept
neatly expressed in the context of Open Source software by
Linus' Law:
"Given enough eyeballs, all bugs are shallow."
Holtkamp adds, "I am pretty confident that Open Access will be
the standard of the future for scientific papers, although it
remains unclear when Open Data will become the norm."
Certainly, if the public is asked to fund further
multi-billion-pound projects like the LHC, there will be growing
pressure on scientists to maximise the value of the data they
generate - and that will require greater openness.