Selected Papers from the CLARIN Annual Conference 2021

This volume presents the highlights of the 10th CLARIN Annual Conference 2021. The conference was held on 27-29 September 2021 and, because of the COVID-19 pandemic, had to adopt a virtual format for the second year in a row.

CLARIN, the Common Language Resources and Technology Infrastructure, is a virtual platform that is accessible for everyone interested in language. CLARIN offers access to language resources, technology, and knowledge, and enables cross-country collaboration among academia, industry, policy-makers, cultural institutions, and the general public. Researchers, students, and citizens are offered access to digital language resources and technology services to deploy, connect, analyse and sustain such resources. In line with the Open Science agenda, CLARIN enables scholars from the Social Sciences and Humanities (SSH) and beyond to engage in and contribute to cutting-edge, data-driven research based on language data in a range of formats and modalities.

Series: Linköping Electronic Conference Proceedings 189
Editors: Monica Monachini and Maria Eskevich
ISBN: 978-91-7929-444-1
ISSN: 1650-3686 (print), 1650-3740 (online)

Published: 2022-07-08

Contents

  • Ravensbrück Interviews: How to Curate Legacy Data to Make it CLARIN Compliant

    Silvia Calamai, Stefania Scagliola, Fabio Ardolino, Christoph Draxler, Arjan van Hessen, Henk van den Heuvel
    1-9
    DOI: https://doi.org/10.3384/ecp1891
  • Italian Language Resources. From CLARIN-IT to the VLO and Back: Sketching a Methodology for Monitoring LRs Visibility

    Dario Del Fante, Francesca Frontini, Monica Monachini, Valeria Quochi
    10-22
    DOI: https://doi.org/10.3384/ecp1892
  • The Nature of Icelandic as a Second Language: An Insight from the Learner Error Corpus for Icelandic

    Isidora Glisic, Anton Karl Ingason
    23-33
    DOI: https://doi.org/10.3384/ecp1893
  • The TEI-based ISO Standard ‘Transcription of spoken language’ as an Exchange Format within CLARIN and beyond

    Hanna Hedeland, Thomas Schmidt
    34-45
    DOI: https://doi.org/10.3384/ecp1894
  • CLARIN Knowledge Centre for Belarusian Text and Speech Processing (K-BLP)

    Yuras Hetsevich, Jauheniya Zianouka, David Latyshevich, Mikita Suprunchuk, Valer Varanovich
    46-55
    DOI: https://doi.org/10.3384/ecp1895
  • Curation Criteria for Multimodal and Multilingual Data: a Mixed Study within the QUEST Project

    Amy Isard, Elena Arestau
    56-67
    DOI: https://doi.org/10.3384/ecp1896
  • Legal Issues Related to the Use of Twitter Data in Language Research

    Pawel Kamocki, Vanessa Hannesschläger, Esther Hoorn, Aleksei Kelli, Marc Kupietz, Krister Lindén, Andrius Puksas
    68-75
    DOI: https://doi.org/10.3384/ecp1897
  • The Interaction of Personal Data, Intellectual Property and Freedom of Expression in the Context of Language Research

    Aleksei Kelli, Krister Lindén, Pawel Kamocki, Kadri Vider, Penny Labropoulou, Ramūnas Birštonas, Vadim Mantrov, Vanessa Hannesschläger, Riccardo Del Gratta, Age Värv, Gaabriel Tavits, Andres Vutt, Esther Hoorn, Jan Hajič, Arvi Tavast
    76-87
    DOI: https://doi.org/10.3384/ecp1898
  • Collaborating on Language Resource Infrastructures with Non-Research Partners: Practicalities and Challenges

    Verena Lyding, Egon Stemle, Alexander König
    88-100
    DOI: https://doi.org/10.3384/ecp1899
  • Annotation Management Tool: A Requirement for Corpus Construction

    Yousuf Ali Mohammed, Arild Matsson, Elena Volodina
    101-108
    DOI: https://doi.org/10.3384/ecp18910
  • Help Yourself from the Buffet: National Language Technology Infrastructure Initiative on CLARIN-IS

    Anna Björk Nikulásdóttir, Þórunn Arnardóttir, Starkaður Barkarson, Jón Guðnason, Þorsteinn Daði Gunnarsson, Anton Karl Ingason, Haukur Páll Jónsson, Hrafn Loftsson, Hulda Óladóttir, Eiríkur Rögnvaldsson, Einar Freyr Sigurðsson, Atli Þór Sigurgeirsson, Vésteinn Snæbjarnarson, Steinþór Steingrímsson, Gunnar Thor Örnólfsson
    109-125
    DOI: https://doi.org/10.3384/ecp18911
  • Building of Parallel and Comparable Cybersecurity Corpora for Bilingual Terminology Extraction

    Andrius Utka, Sigita Rackevičienė, Aivaras Rokas, Liudmila Mockienė, Marius Laurinaitis, Agnė Bielinskienė
    126-138
    DOI: https://doi.org/10.3384/ecp18912
  • ‘Cretan Institutional Inscriptions’ Meets CLARIN-IT

    Irene Vagionakis, Riccardo Del Gratta, Federico Boschetti, Paola Baroni, Angelo Mario Del Grosso, Tiziana Mancinelli, Monica Monachini
    139-150
    DOI: https://doi.org/10.3384/ecp18913
  • Reliability of Automatic Linguistic Annotation: Native vs Non-native Texts

    Elena Volodina, David Alfter, Therese Lindström Tiedemann, Maisa Lauriala, Daniela Piipponen
    151-167
    DOI: https://doi.org/10.3384/ecp18914
  • Flexible Metadata Schemes for Research Data Repositories. The Common Framework in Dataverse and the CMDI Use Case

    Jerry de Vries, Vyacheslav Tykhonov, Andrea Scharnhorst, Eko Indarto, Femmy Admiraal, Mike Priddy
    168-180
    DOI: https://doi.org/10.3384/ecp18915
  • Bagman – A Tool that Supports Researchers Archiving Their Data

    Claus Zinn
    181-189
    DOI: https://doi.org/10.3384/ecp18916
  • ARCHE Suite: A Flexible Approach to Repository Metadata Management

    Mateusz Żółtak, Martina Trognitz, Matej Ďurčo
    190-199
    DOI: https://doi.org/10.3384/ecp18917