SCAPE at Hadoop Summit, March 2013

Hadoop Summit Europe
20-21 March 2013, Amsterdam, Netherlands

SCAPE has been accepted to participate in the upcoming Hadoop Summit, the leading conference for the Apache Hadoop community. Clemens Neudecker (National Library of the Netherlands) and Sven Schlarb (Austrian National Library) will present the paper ‘The Elephant in the Library’ in the Integrating Hadoop track (20 March, 15.30 – 16.30).

An interview with both presenters is available from the Hadoop website: Meet the Presenters.

Libraries collect books, magazines and newspapers. Yes, that’s what they have always done. But today, the volume of digital information resources is growing at dizzying speed. Faced with the demand for digital information resources that are available 24/7, libraries have seen a significant shift in their core responsibilities. Today’s libraries are curating large digital collections, indexing millions of full-text documents, preserving terabytes of data for future generations, and at the same time exploring innovative ways of providing access to their collections.

This is exactly where Hadoop comes into play. Libraries have to process a rapidly increasing amount of data as part of their day-to-day business, and computing tasks such as file format migration, text recognition and linguistic processing require significant computing resources. Many data processing scenarios are emerging in which Hadoop might become an essential part of the digital library’s ecosystem. Hadoop is sometimes described as a hammer that makes you throw away everything that is not a nail. To stay with that metaphor: we will present some actual use cases for Hadoop in libraries, explain how we determine what is a nail in a library and what is not, and share some initial results.
