XSL-HoReCo and GoSt-ParC-Sign: Two New Signed Language - Written Language Parallel Corpora


  • Mirella De Sisto, Tilburg University, the Netherlands
  • Vincent Vandeghinste, Instituut voor de Nederlandse Taal, Leiden, the Netherlands, and KU Leuven, Belgium
  • Caro Brosens, Vlaamse GebarentaalCentrum, Antwerp, Belgium
  • Myriam Vermeerbergen, KU Leuven, Belgium
  • Dimitar Shterionov, Tilburg University, the Netherlands




Developments in language technology targeting signed languages lag behind the advances made for so-called spoken languages.¹ This is partly due to the scarcity of good-quality signed language data, including good-quality parallel corpora of signed and spoken languages. This paper introduces two parallel corpora that aim to narrow the gap between signed and spoken-only language technology: the XSL Hotel Review Corpus (XSL-HoReCo) and the Gold Standard Parallel Corpus of Signed and Spoken Language (GoSt-ParC-Sign). Both corpora are available through the CLARIN infrastructure.