MultiGED-2023 shared task at NLP4CALL: Multilingual Grammatical Error Detection

Authors

  • Elena Volodina
  • Christopher Bryant
  • Andrew Caines
  • Orphée De Clercq
  • Jennifer-Carmen Frey
  • Elizaveta Ershova
  • Alexandr Rosen
  • Olga Vinogradova

DOI:

https://doi.org/10.3384/ecp197001

Keywords:

Grammatical Error Detection (GED); Multilingual systems; Second language datasets for Czech, English, German, Italian and Swedish; Computational SLA; the Matthew effect

Abstract

This paper reports on the NLP4CALL shared task on Multilingual Grammatical Error Detection (MultiGED-2023), which included five languages: Czech, English, German, Italian and Swedish. It is the first shared task organized by the Computational SLA working group, whose aim is to promote less-represented languages in Grammatical Error Detection and Correction and in related fields. The MultiGED datasets were derived from second language (L2) learner corpora for each language. In this paper we introduce the task as a whole, elaborate on the dataset generation process and the design choices made to obtain the MultiGED datasets, and provide details of the evaluation metrics and the CodaLab setup. We also briefly describe the systems used by participants and report the results.

Published

2023-05-16