News

UK’s Arkivum up for €4.8m EU science data archive project

Arkivum’s bid for the CERN-led multi-petabyte-scale research data archive will use Google Cloud and compete with AWS-based and open source designs from rival providers Libnova and Onedata

Antony Adshead, Storage Editor

Published: 03 Mar 2021 13:59

UK-based digital archiving provider Arkivum is one of three providers selected for a €4.8m second prototyping phase of the European multinational Archiver project, which aims to provide petabyte-scale data archiving and preservation for its scientific partners.

The project is led by CERN – home of the Large Hadron Collider near Geneva – and also comprises DESY (Deutsches Elektronen-Synchrotron), EMBL-EBI (European Bioinformatics Institute) and PIC (Port d’Informació Científica).

The Archiver project aims to provide petabyte-scale storage for a wide variety of research and analytics use cases for the scientific partners involved, said João Fernandes, Archiver project leader from CERN’s IT department.

The scalability of the technology is a high priority because the expected eventual capacity will be in the tens of petabytes. In the prototype phase of the project, the system will ingest data at rates of up to 100TB a day.

This second phase is worth €4.8m and will last eight months. Archiver is co-funded by the European Union’s Horizon 2020 research and innovation programme.

“It’s not just about storing bits, but also about intellectual control of the data, so preserving what has been done with the data previously, who by, and keeping the documentation and the software,” said Fernandes.

Those requirements are summed up in the FAIR principles – Findable, Accessible, Interoperable and Reusable – so that experiments can be reproduced and continued long after they were last worked on, if needed.

The time between experiments can be lengthy, so data has significant long-term value and needs to remain active and accessible in the archive for possibly decades after a research project has ended. At present, custom-built databases for handling complex and sometimes sensitive datasets present barriers to researchers uploading and downloading data.

