A distantly supervised Grammatical Error Detection/Correction system for Swedish
Keywords: distant supervision, grammatical error correction, machine translation, Swedish
Abstract
This paper presents our submission to the first Shared Task on Multilingual Grammatical Error Detection (MultiGED-2023). Our method uses a transformer-based sequence-to-sequence model trained on a synthetic dataset of 3.2 billion words. We adopt a distantly supervised approach: the training process relies exclusively on the distribution of language learners' errors extracted from the annotated corpus used to construct the training data. In the Swedish track, our model ranks fourth out of seven submissions in terms of the target F0.5 metric, while achieving the highest precision. These results suggest that our model is conservative yet remarkably precise in its predictions.
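As a brief illustration of why precision matters so much for the target metric, the sketch below computes the standard F-beta score with beta = 0.5, which weights precision more heavily than recall. The function name `f_beta` and the precision/recall values are hypothetical, chosen only for illustration; they are not results from the shared task.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score: weighted harmonic mean of precision and recall.

    beta < 1 weights precision more heavily; beta = 1 gives the usual F1.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative (made-up) systems, NOT figures from the paper.
conservative = {"precision": 0.8, "recall": 0.4}  # few, mostly correct flags
balanced = {"precision": 0.6, "recall": 0.6}      # more flags, more mistakes

for name, scores in [("conservative", conservative), ("balanced", balanced)]:
    f05 = f_beta(scores["precision"], scores["recall"], beta=0.5)
    f1 = f_beta(scores["precision"], scores["recall"], beta=1.0)
    print(f"{name}: F0.5={f05:.3f} F1={f1:.3f}")
```

With these made-up numbers, the conservative system scores higher on F0.5 but lower on F1 than the balanced one, which is the kind of trade-off a precision-oriented detector makes.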
Copyright (c) 2023 Murathan Kurfalı, Robert Östling
This work is licensed under a Creative Commons Attribution 4.0 International License.