Cern, the world's largest particle physics laboratory, is
planning to use the large database support in Oracle 10g to store
the massive amounts of data generated by its latest advanced
particle accelerator, which is due to go live by
2007.
Cern plans to store between several hundred terabytes and a few
petabytes of data. Over 10 years, the database would need to store
100Pbytes, said Jamie Shiers, database group leader at Cern.
An Oracle customer for 20 years, Cern has rolled out the technology
throughout the organisation, using it to book meeting rooms, manage
the network and power e-business operations.
The organisation has migrated 300Tbytes of data from an object
database to Oracle running on Linux since 2002. It is using Oracle
databases and application server software to track this data and to
schedule computational work.
By early 2004, Cern plans to use Oracle's Real Application Clusters
feature to provide high levels of availability on its grid.
"If the application related to the grid is unavailable, the grid
halts and you could lose all the computational work queued up to
run over the grid," said Shiers.
Shiers also highlighted the improvements Oracle has made in
processing "native numbers", which are used heavily in the
scientific community. "The Oracle database was always efficient
when dealing with figures such as $99.99, but it did not perform
when handling calculations," he said.
Shiers would like those numbers to be stored in the database with
no overhead, because at 100Pbytes of data an overhead factor of two
would double the storage requirements.
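
To make the scale of that concern concrete, the arithmetic can be sketched in a few lines of Python. This is an illustration only: the decimal petabyte convention and the helper name stored_bytes are assumptions made for the example, not figures or code from Cern or Oracle.

```python
# Illustrative arithmetic only; the helper name and the decimal
# petabyte convention are assumptions, not Cern's or Oracle's figures.
PB = 10**15  # one petabyte in bytes (decimal convention)


def stored_bytes(raw_bytes: int, overhead_factor: float) -> int:
    """Return the on-disk size of raw data for a given storage overhead factor."""
    if overhead_factor < 1:
        raise ValueError("overhead factor must be at least 1 (1 means no overhead)")
    return int(raw_bytes * overhead_factor)


raw = 100 * PB                        # the 10-year volume quoted by Shiers
no_overhead = stored_bytes(raw, 1.0)  # numbers stored at their native size
doubled = stored_bytes(raw, 2.0)      # storage format with a factor-of-two overhead
extra_pb = (doubled - no_overhead) // PB

print(f"No overhead:   {no_overhead // PB} PB")
print(f"Factor of two: {doubled // PB} PB ({extra_pb} PB extra)")
```

At this volume, even a modest per-value overhead translates into tens of petabytes of additional storage.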
The Oracle database has also been used to track the decommissioning
of the previous particle accelerator. "Every single component, every magnet, every
vacuum tube and every nut and bolt needs to be logged in the
database," he said. Moreover, Shiers said the components need to be
logged permanently. "We are not sure how to do this," he said. "It
is quite a demanding application."