D9.1 Characterisation technology, Release 1 & release report

This report summarises the work done during the first year of the SCAPE project in the Characterisation Components work package of the Preservation Components sub-project. It documents three tools commonly used in the digital preservation community: DROID, Apache Tika™, and FIDO. The tools have been evaluated against the GovDoc1 document corpus and its accompanying ground truth. The report also documents the work carried out to extend two of the selected tools in particular, Apache Tika™ and DROID. Finally, a road map for the next year is laid out. No conclusions regarding the quality of the tools are presented in this report.