An active approach to maintaining digital archives
14 June 2013
Archiving has long been considered a passive process: put the things you want to keep in a cool, dry place and forget about them until needed.
However, in the digital era, in which photos, videos, documents and other content sit on hard drives, flash disks or servers in ‘the cloud’ rather than in boxes in a cellar, archiving requires a much more active approach. EU-funded researchers are addressing the problem.
Everybody in one way or another is an archivist. Companies and public administrations need to keep records going back years, media organisations have photos and videos they want to store and reuse, museums try to archive all manner of content for posterity, and almost everyone these days has large personal collections of multimedia content on their hard drive.
In many ways, digital content is thought to be more secure and enduring than analogue materials: a digital photo on a hard drive does not degrade over time like a printed image stored in a box in the attic. But that does not mean it cannot be corrupted, changed or lost entirely.
‘While we understand well the chemical processes involved in analogue degradation, the issues with digital archiving are totally different,’ notes Daniel Teruggi, a composer, researcher and head of research at the Ina EXPERT directorate of France’s Institut national de l’audiovisuel (Ina), which is dedicated to audiovisual and multimedia education, training and research.
Because digital content is essentially just a sequence of numbers, the slightest change to one digit can have dramatic effects on everything from quality to accessibility. This can happen for a variety of reasons: so-called ‘bit rot’ in a hard drive when it starts to lose some of its magnetic properties, a software system or hardware change, or inadvertent modification by someone accessing the archive.
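The sensitivity described above can be illustrated with a short Python sketch. It is not one of the PRESTOPRIME tools: the helper names `sha256_hex` and `flip_bit` and the sample data are assumptions made for this example. It inverts a single bit in a block of bytes and compares a cryptographic digest of the content before and after.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as hexadecimal text."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


# Stand-in for an archived file: 1024 bytes of made-up content.
original = bytes(range(256)) * 4
damaged = flip_bit(original, bit_index=1000)

# Count how many bytes differ and check whether the digests still agree.
differing_bytes = sum(a != b for a, b in zip(original, damaged))
digests_match = sha256_hex(original) == sha256_hex(damaged)

print(f"bytes that differ: {differing_bytes} of {len(original)}")
print(f"digests match: {digests_match}")
```

Only one byte out of 1024 differs, yet the two digests no longer match. This is why a stored digest is a common way to notice silent changes that would otherwise go unseen.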
‘Another major factor is compression. Say you have a 20 megabyte file and compress that to 2 megabytes. If something goes wrong during the compression or upon accessing it, moving it or storing it, the consequences can be huge,’ Dr Teruggi says. ‘At present the processes involved in digital archiving are far from perfect.’
Dr Teruggi and a team from 14 organisations in six countries have been addressing those issues and others in the PRESTOPRIME project. The four-year initiative was supported by EUR 8 million in funding from the European Commission, and its consortium spans the full range of archive users and multimedia content researchers, from museums and broadcasters to technology companies, R&D institutes and universities.
Together, they have created a suite of innovative, open source tools to help archives of any size manage and monitor their content, analyse risks to its long-term preservation, verify and assure its integrity, and do so while helping archivists understand the costs involved in terms of both time and money.
‘Archiving in the digital era can no longer be a passive process; it requires an active approach. Archived content needs to be analysed, monitored and checked regularly to ensure its integrity and long-term preservation,’ Dr Teruggi, who coordinated PRESTOPRIME, explains.
Keeping tabs on digital content
The PRESTOPRIME team’s approach is somewhat akin to the modern track-and-trace systems used to monitor food in storage and transport, so-called ‘farm-to-fork traceability’. It focuses on helping archivists know automatically what is stored where and what condition it is in.
‘While there is certainly some similarity with food monitoring technologies, we face an added complexity. If food rots it smells. With digital content you have no straightforward way of knowing if it has gone bad, possibly for a very long time, maybe long after something can be done about it,’ the PRESTOPRIME coordinator notes.
Tools developed in the PRESTOPRIME project enable the automated checking of archived content, while also helping archivists to estimate the risks of moving or modifying it in some way, for example during a computer system or storage upgrade.
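As a rough illustration of what automated checking can involve, the sketch below records a digest for every file in an archive folder and later audits the folder against that record. It is a minimal example of a general technique, often called a fixity check, and does not reproduce the PRESTOPRIME software. The function names (`file_digest`, `build_manifest`, `audit`) and the sample file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its current digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict) -> dict:
    """Compare the files under root with the digests recorded earlier."""
    report = {"ok": [], "changed": [], "missing": []}
    for relative_path, expected in manifest.items():
        path = root / relative_path
        if not path.is_file():
            report["missing"].append(relative_path)
        elif file_digest(path) != expected:
            report["changed"].append(relative_path)
        else:
            report["ok"].append(relative_path)
    return report


# Demonstration on a throwaway folder with made-up content.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.wav").write_bytes(b"audio")
    (root / "b.mov").write_bytes(b"video")
    (root / "c.jpg").write_bytes(b"photo")

    manifest = build_manifest(root)             # taken at ingest time

    (root / "b.mov").write_bytes(b"vide0")      # simulated silent corruption
    (root / "c.jpg").unlink()                   # simulated loss

    report = audit(root, manifest)
    print(report)
```

Running an audit like this before and after a migration gives a simple measure of whether the move altered or dropped anything. The manifest itself would need to be stored separately from the content it describes.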
‘In the digital era nothing is static. Systems and storage devices change every few years, and every change represents a risk to the integrity of archived content,’ Dr Teruggi notes.
The team focused not just on the content itself but also on its associated metadata: information about where and when a photo was taken, who took it and what it shows, for example. They developed tools to simplify adding metadata to content that lacks it and to integrate different sets of metadata applied to the same content, a process known as ‘metadata mapping’.
‘Metadata is very important for archived content. Think about it like storing a box of photos in the attic. You know who the photos are of - and where and when they were taken - but if years later your grandchildren find them, they will probably have no idea. Metadata applied to digital content provides a way of preserving that information while also making managing and organising the content itself much simpler,’ Dr Teruggi says.
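The idea behind metadata mapping can be sketched in a few lines of Python. The example assumes two invented source schemas (one resembling camera metadata, one resembling a catalogue record) and an invented common schema. The field names, sample values and the functions `map_record` and `merge_records` are all illustrative and are not taken from the PRESTOPRIME tools.

```python
# Hypothetical mappings from two source schemas to one common schema.
CAMERA_TO_COMMON = {
    "DateTimeOriginal": "created",
    "Artist": "creator",
    "ImageDescription": "description",
}
CATALOGUE_TO_COMMON = {
    "date": "created",
    "photographer": "creator",
    "subject": "description",
    "place": "location",
}


def map_record(record: dict, mapping: dict) -> dict:
    """Rename known fields to the common schema, dropping empty values."""
    return {
        mapping[field]: value
        for field, value in record.items()
        if field in mapping and value not in (None, "")
    }


def merge_records(*records: dict):
    """Merge mapped records; the first value wins and conflicts are reported."""
    merged = {}
    conflicts = {}
    for record in records:
        for field, value in record.items():
            if field not in merged:
                merged[field] = value
            elif merged[field] != value:
                conflicts.setdefault(field, [merged[field]]).append(value)
    return merged, conflicts


# Two made-up descriptions of the same photo.
camera = {"DateTimeOriginal": "2013-06-14", "Artist": "A. Example", "ImageDescription": ""}
catalogue = {
    "date": "2013-06-14",
    "photographer": "Anna Example",
    "subject": "Harbour at dusk",
    "place": "Marseille",
}

merged, conflicts = merge_records(
    map_record(camera, CAMERA_TO_COMMON),
    map_record(catalogue, CATALOGUE_TO_COMMON),
)
print(merged)
print(conflicts)
```

Reporting conflicts rather than silently overwriting them is a deliberate choice in this sketch: where two sources disagree, an archivist may want to decide which value to keep.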
Most of the PRESTOPRIME tools are already available as open source applications and can be downloaded and used by anyone. They also form a key part of the activities of the PrestoCentre, an organisation set up by the project partners and now incorporating a range of other organisations dedicated to advancing research and developing solutions for digital archives.
PrestoCentre organises conferences and workshops to help organisations, from small local museums to major international media groups, improve their digital archiving processes, and has made significant contributions to international standardisation activities.
‘The feedback we have received from PrestoCentre members about the tools has been extremely positive. In addition, one of the project partners, Ex Libris, which usually works with library archives, has started to use the tools commercially, as it looks to provide solutions in the audio-visual domain,’ the project coordinator notes.
The project partners have since gone on to launch the ‘European Technology for Digital Audiovisual Media Preservation’ (PRESTO4U) project, also with the support of the European Commission, in which they are analysing the disparate needs of different user communities.
‘In the future, we would also like to develop tools for individuals, so people at home can better manage and preserve their photos, videos and other content,’ Dr Teruggi says.