A distantly supervised Grammatical Error Detection/Correction system for Swedish

Authors

  • Murathan Kurfalı
  • Robert Östling

DOI:

https://doi.org/10.3384/ecp197004

Keywords:

distant supervision, grammatical error correction, machine translation, Swedish

Abstract

This paper presents our submission to the first Shared Task on Multilingual Grammatical Error Detection (MultiGED-2023). Our method utilizes a transformer-based sequence-to-sequence model, which was trained on a synthetic dataset consisting of 3.2 billion words. We adopt a distantly supervised approach, with the training process relying exclusively on the distribution of language learners’ errors extracted from the annotated corpus used to construct the training data. In the Swedish track, our model ranks fourth out of seven submissions in terms of the target F0.5 metric, while achieving the highest precision. These results suggest that our model is conservative yet remarkably precise in its predictions.
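
For readers unfamiliar with the target metric, the following is a minimal sketch of the standard F-beta formula with beta = 0.5, which weights precision more heavily than recall. The function name `f_beta` and the input values are illustrative assumptions; they are not taken from the paper or from the shared task's official scorer.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favours precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is predicted correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical values for illustration only (not results from the paper):
# a precise but conservative system versus a high-recall, low-precision one.
precise_system = f_beta(precision=0.8, recall=0.4)
recall_heavy_system = f_beta(precision=0.4, recall=0.8)
print(precise_system > recall_heavy_system)
```

With these made-up inputs the precise system scores higher, which shows how F0.5 rewards precision. A system can still lose ground on F0.5 if its recall is low enough, so the highest precision does not guarantee the highest rank.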

Published

2023-05-16