MultiGED-2023 shared task at NLP4CALL: Multilingual Grammatical Error Detection

Elena Volodina; Christopher Bryant; Andrew Caines; Orphée De Clercq; Jennifer-Carmen Frey; Elizaveta Ershova; Alexandr Rosen; Olga Vinogradova

doi:10.3384/ecp197001

Keywords:

Grammatical Error Detection (GED), Multilingual systems, Second language datasets for Czech English German Italian Swedish, Computational SLA, the Matthew effect

Abstract

This paper reports on the NLP4CALL shared task on Multilingual Grammatical Error Detection (MultiGED-2023), which included five languages: Czech, English, German, Italian and Swedish. It is the first shared task organized by the Computational SLA working group, whose aim is to promote less-represented languages in Grammatical Error Detection and Correction and related fields. The MultiGED datasets were produced from second language (L2) learner corpora for each language. In this paper we introduce the task as a whole, elaborate on the dataset generation process and the design choices made to obtain the MultiGED datasets, and provide details of the evaluation metrics and the CodaLab setup. We then briefly describe the systems used by participants and report the results.

Published

2023-05-16

Issue

Proceedings of the 12th Workshop on Natural Language Processing for Computer Assisted Language Learning (NLP4CALL 2023)

License

This work is licensed under a Creative Commons Attribution 4.0 International License.