SCAPE at the IS&T Archiving conference


IS&T Archiving conference
2-5 April 2013, Washington DC, USA
http://www.imaging.org/ist/conferences/archiving/

SCAPE will participate in this large-scale event with three research papers:

  • “Scalable Preservation Decisions: A Controlled Case Study”, by Christoph Becker and Hannes Kulovits (Vienna University of Technology, Austria), and Bjarne Andersen (State and University Library, Denmark). Session: Digital Preservation II, 4 April.
    Abstract: This article reports on a systematic, controlled case study where a preservation plan is created for a large collection of audio material using the planning tool Plato 3, and takes this study as a starting point to assess the state of the art in scaling up decision-making processes in preservation planning. We report on the effort required for specific preservation planning activities such as the evaluation of potential preservation actions and the specification of evaluation criteria. We compare this with a number of improvements to the planning process and tool that are developed as part of the SCAPE project, analyze the status of improvements and the limits of automation, and outline future steps towards scalable preservation decisions.
  • “Guidelines for Legacy Repository Migration”, by Miguel Ferreira and Luis Faria (Keep Solutions) and José Carlos Ramalho (University of Minho). Session: Standards & Guidelines, 5 April.
    Abstract: Several institutions are currently running long-term digital repositories that have been in operation for several years now. Some of these systems are approaching the end of their life spans and will soon be replaced by the next generation of long-term digital repository systems. This will unavoidably imply the migration of millions of files, metadata records and terabytes of data from the legacy repository to the newly adopted one. Because of the large scale of this operation, this procedure needs careful planning, validation and support.
    This paper provides guidelines to support the migration from legacy repository systems by describing the stages, activities and associated risks that comprise this type of endeavor. The presented guidelines are based on a combination of 13 existing methodologies that have been surveyed and unified into a comprehensive multistep methodology.
  • “An Open Source Infrastructure for Quality Assurance and Preservation of a Large Digital Book Collection”, by Sven Schlarb (Austrian National Library). Session: Quality Management, 5 April.
    Abstract: Within the SCAPE project context, we will present an open source infrastructure for preserving large collections of digital objects created at the Austrian National Library for quality assurance tasks as part of the management of a large digital book collection. We describe the experimental cluster hardware and the software components used for creating the infrastructure. More concretely, we will show a set of best practices for the data analysis of large document image collections on the basis of Apache Hadoop.
    We are focusing on two institutional scenarios: migrating TIFF master image files to compressed JPEG2000 image files and comparing different derivatives of the same intellectual book entity based on the comparison of book page images and an aggregated similarity measure. In both scenarios we are facing the challenge that the processing is I/O intensive due to the need to load large data sets, and it is CPU intensive due to the use of quality assurance software operating on image data.
    Finally, we will show the results of the evaluation of the proposed infrastructure by using data sets from the digital book collection of the Austrian National Library, covering the 16th to the 19th century.
