Ravensbrück Interviews: How to Curate Legacy Data to Make it CLARIN Compliant


  • Silvia Calamai
  • Stefania Scagliola
  • Fabio Ardolino
  • Christoph Draxler
  • Arjan van Hessen
  • Henk van den Heuvel




Keywords: Legacy Data, Oral Archives, CLARIN Resource Families, Metadata, Transcription Chain


This paper describes the preparatory phase of a CLARIN-funded project called ‘Voices from Ravensbrück’, which aims to introduce a new type of corpus in the CLARIN resource family called ‘Oral Histories’. The first task consisted of curating and transcribing a set of interviews that the Italian author A.M. Bruzzone conducted in 1977 with five Italian survivors of the Ravensbrück concentration camp. This posed the considerable challenges inherent in integrating legacy data from the pre-digital era into the CLARIN infrastructure. The second task was to explore the potential of automatic speech transcription for this type of oral history data. The third task was to identify potential partners and suitable data for creating a multilingual collection of existing oral history interviews with survivors of the Ravensbrück concentration camp. These preparatory steps were necessary to move to the final phase of our project and realise our overall objective of creating a resource family compliant with CLARIN standards and enabling scholars to analyse interviews from a comparative, multilingual and multidisciplinary perspective.