Building data archives that last is an urgent task. That’s the message of the European Commission’s eArchiving initiative, which has just announced version 2.0 of its architecture and the renewal of its funding for another two years.

Under the tutelage of the commission, the initiative will define processes – using open formats and metadata – that mean organisations won’t have to keep old IT equipment hanging around just in case they need it to read old data.

“There are a number of problems when you want to restore very old data,” said Gregor Završnik, a researcher at the University of Ljubljana in Slovenia, who is a consultant in geospatial data archiving and a member of the eArchiving initiative. “For sure, you have to be able to read the storage media and read the file format – but it gets worse. When you have finally extracted data from an Excel table, you don’t have the context.

“So, you don’t know what the numbers you have restored correspond to. How were they collected? With what level of precision? Are they authentic?” he added, speaking to French sister site LeMagIT during a recent IT Press Tour event.

The eArchiving initiative builds on the E-Ark project, a community of developers that has worked since 2014 to create universal, long-lasting tools to validate, reformat and archive data. The key challenge is to make archives interoperable through common encoding while also conforming to regulatory requirements.

From research project to European initiative

“At the start of E-Ark, we imagined we’d create a universal format for archiving,” said Završnik. “But as we progressed, we realised these archives are mostly kept by those who created the data originally, and that everyone thinks that this data will be commercially valuable even far in the future. So, what we need is to create a standard that allows an enterprise to restore its own archives after several years.”

A major obstacle, however, has been that the E-Ark project has struggled to bring together the big players in storage and backup. It is made up of a dozen teams, but these are overwhelmingly from the world of research.

The challenge at the level of the European Commission is that, to transform E-Ark into eArchiving, the technical content of the project needs to become an accepted standard in the market. A key early stage is for the universal archive format imagined by E-Ark to be standardised and to correspond to the new revision of ISO 14721, the reference model for an open archival information system.

“Even if the commission demands that the public sector in the EU adopts our archive format, it can’t oblige enterprises to do the same,” said Završnik. “But it can say to them that if they use an open format, they won’t be locked in for eternity to a technology that necessitates use of commercial tools. And what’s more, it will allow them to exchange data freely with each other.”