https://ecp.ep.liu.se/index.php/sltc/issue/feed
Swedish Language Technology Conference and NLP4CALL

Papers are invited on all theoretical, practical and applied aspects of language technology, including natural language processing, computational linguistics, speech technology and neighbouring areas. Papers can describe completed or ongoing research, as well as practical applications of language technology, and may be combined with system demonstrations.

https://ecp.ep.liu.se/index.php/sltc/article/view/677
MultiGED-2023 shared task at NLP4CALL: Multilingual Grammatical Error Detection
Elena Volodina, Christopher Bryant, Andrew Caines, Orphée De Clercq, Jennifer-Carmen Frey, Elizaveta Ershova, Alexandr Rosen, Olga Vinogradova

This paper reports on the NLP4CALL shared task on Multilingual Grammatical Error Detection (MultiGED-2023), which included five languages: Czech, English, German, Italian and Swedish. It is the first shared task organized by the Computational SLA working group, whose aim is to promote less represented languages in the fields of Grammatical Error Detection and Correction and other related fields. The MultiGED datasets have been produced from second language (L2) learner corpora for each of the languages. In this paper we introduce the task as a whole, elaborate on the dataset generation process and the design choices made to obtain the MultiGED datasets, and provide details of the evaluation metrics and the CodaLab setup. We further briefly describe the systems used by participants and report the results.

Published: 2023-05-16. Copyright (c) 2023 Elena Volodina, Christopher Bryant, Andrew Caines, Orphée De Clercq, Jennifer-Carmen Frey, Elizaveta Ershova, Alexandr Rosen, Olga Vinogradova.
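For reference, the F0.5 score referred to in several of the system descriptions below is the standard F-beta measure with beta = 0.5, which weights precision more heavily than recall. With precision P and recall R computed over the predicted error tokens:

    F_{0.5} = \frac{(1 + 0.5^2)\, P \cdot R}{0.5^2 \cdot P + R} = \frac{1.25\, P R}{0.25\, P + R}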
https://ecp.ep.liu.se/index.php/sltc/article/view/678
NTNU-TRH system at the MultiGED-2023 Shared Task on Multilingual Grammatical Error Detection
Lars Bungum, Björn Gambäck, Arild Brandrud Næss

The paper presents a monolithic approach to grammatical error detection, which uses one model for all languages, in contrast to the individual approach, which creates separate models for each language. For both approaches, pre-trained embeddings are the only external knowledge sources. Two sets of embeddings (Flair and BERT) are compared, as are the two approaches to multilingual grammatical error detection: building individual systems versus a single monolithic one. The system submitted to the test phase of the MultiGED-2023 shared task ranked 5th of 6 systems. In the subsequent open phase, more experiments were conducted, improving the results. These results show the individual models performing better than the monolithic ones, and BERT embeddings working better than Flair embeddings for the individual models, while the picture is more mixed for the monolithic models.

Published: 2023-05-16. Copyright (c) 2023 Lars Bungum, Björn Gambäck, Arild Brandrud Næss.

https://ecp.ep.liu.se/index.php/sltc/article/view/679
EliCoDe at MultiGED2023: fine-tuning XLM-RoBERTa for multilingual grammatical error detection
Davide Colla, Matteo Delsanto, Elisa Di Nuovo

In this paper we describe the participation of our team, ELICODE, in the first shared task on Multilingual Grammatical Error Detection, MultiGED, organised within the workshop series on Natural Language Processing for Computer-Assisted Language Learning (NLP4CALL). The multilingual shared task includes five languages: Czech, English, German, Italian and Swedish. The shared task is framed as a binary classification task at token level, aiming to identify correct and incorrect tokens in the provided sentences. The submitted system is a token classifier based on the XLM-RoBERTa language model. We fine-tuned five different models, one per language in the shared task. We devised two experimental settings: first, we trained the models only on the provided training set, using the development set to select the model achieving the best performance across the training epochs; second, we trained each model jointly on the training and development sets for 10 epochs, retaining the 10-epoch fine-tuned model. Our submitted systems, evaluated using the F0.5 score, achieved the best performance on all test sets except the English REALEC dataset, on which they ranked second. Code and models are publicly available at https://github.com/davidecolla/EliCoDe.

Published: 2023-05-16. Copyright (c) 2023 Davide Colla, Matteo Delsanto, Elisa Di Nuovo.
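As an illustration of the approach described in the abstract above, the following is a minimal sketch of token-level binary classification by fine-tuning XLM-RoBERTa with the Hugging Face Transformers library; the toy sentences, labels and single optimisation step are placeholders, not the authors' actual data or training setup (their code is available at the repository linked above).

    # Minimal sketch (placeholder data, not the EliCoDe code): token-level binary
    # grammatical error detection by fine-tuning XLM-RoBERTa.
    import torch
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
    model = AutoModelForTokenClassification.from_pretrained(
        "xlm-roberta-base", num_labels=2)  # 0 = correct token, 1 = incorrect token
    model.train()

    # Toy word-level sentences with per-token correct/incorrect labels.
    sentences = [["She", "go", "to", "school"], ["He", "reads", "a", "book"]]
    labels = [[0, 1, 0, 0], [0, 0, 0, 0]]

    enc = tokenizer(sentences, is_split_into_words=True,
                    padding=True, truncation=True, return_tensors="pt")

    # Align word-level labels with sub-word tokens; special tokens, padding and
    # non-initial sub-words get the ignore index -100.
    aligned = []
    for i, word_labels in enumerate(labels):
        row, prev = [], None
        for wid in enc.word_ids(batch_index=i):
            row.append(-100 if wid is None or wid == prev else word_labels[wid])
            prev = wid
        aligned.append(row)
    enc["labels"] = torch.tensor(aligned)

    # One gradient step; a real setup would loop over batches and epochs and
    # monitor F0.5 on the development set.
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    loss = model(**enc).loss
    loss.backward()
    optimizer.step()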
https://ecp.ep.liu.se/index.php/sltc/article/view/680
A distantly supervised Grammatical Error Detection/Correction system for Swedish
Murathan Kurfalı, Robert Östling

This paper presents our submission to the first Shared Task on Multilingual Grammatical Error Detection (MultiGED-2023). Our method utilizes a transformer-based sequence-to-sequence model, which was trained on a synthetic dataset consisting of 3.2 billion words. We adopt a distantly supervised approach, with the training process relying exclusively on the distribution of language learners' errors extracted from the annotated corpus used to construct the training data. In the Swedish track, our model ranks fourth out of seven submissions in terms of the target F0.5 metric, while achieving the highest precision. These results suggest that our model is conservative yet remarkably precise in its predictions.

Published: 2023-05-16. Copyright (c) 2023 Murathan Kurfalı, Robert Östling.

https://ecp.ep.liu.se/index.php/sltc/article/view/681
Two Neural Models for Multilingual Grammatical Error Detection
Phuong Le-Hong, The Quyen Ngo, Thi Minh Huyen Nguyen

This paper presents two neural models for multilingual grammatical error detection and their results in the MultiGED-2023 shared task. The first model uses a simple, purely supervised character-based approach. The second model uses a large language model which is pretrained on 100 different languages and fine-tuned on the provided datasets of the shared task. Despite their simplicity, the two systems achieved promising results: one system obtained the second-best F-score, and the other placed in the top four of participating systems.

Published: 2023-05-16. Copyright (c) 2023 Phuong Le-Hong, The Quyen Ngo, Thi Minh Huyen Nguyen.

https://ecp.ep.liu.se/index.php/sltc/article/view/682
Experiments on Automatic Error Detection and Correction for Uruguayan Learners of English
Romina Brown, Santiago Paez, Gonzalo Herrera, Luis Chiruzzo, Aiala Rosá

This paper presents an initial experiment on Grammatical Error Correction and Automatic Grading for short texts written by Uruguayan students who are learning English. We present a set of error detection and correction heuristics, and some experiments on using these heuristics to predict the grade. Although our experiments are limited due to the nature of the dataset, they are a good proof of concept with promising results that might be extended in the future.

Published: 2023-05-16. Copyright (c) 2023 Romina Brown, Santiago Paez, Gonzalo Herrera, Luis Chiruzzo, Aiala Rosá.
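Purely as an illustration of what detection/correction heuristics of this general kind can look like (the actual rules developed for the Uruguayan learner data are not reproduced here), the sketch below applies two toy rules and counts the edits made, a quantity that could serve as one feature for grade prediction.

    # Illustrative sketch only: two toy detect-and-correct rules, not the
    # authors' actual heuristics.
    import re

    RULES = [
        # "a" before a word starting with a vowel letter -> "an" (crude approximation)
        (re.compile(r"\ba\b(?=\s+[aeiouAEIOU])"), "an"),
        # immediately repeated word ("the the") -> a single occurrence
        (re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE), r"\1"),
    ]

    def detect_and_correct(text):
        """Apply each rule; return the corrected text and the number of edits made."""
        edits = 0
        for pattern, replacement in RULES:
            text, n = pattern.subn(replacement, text)
            edits += n
        return text, edits

    corrected, n_errors = detect_and_correct("She has a apple and the the dog runs.")
    print(corrected, n_errors)  # the edit count could feed a simple grade predictor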
https://ecp.ep.liu.se/index.php/sltc/article/view/683
Sequence Tagging in EFL Email Texts as Feedback for Language Learners
Yuning Ding, Ruth Trüb, Johanna Fleckenstein, Stefan Keller, Andrea Horbach

When predicting scores for different aspects of a learner text, automated scoring algorithms usually cannot provide information about which part of the text a score refers to. We therefore propose a method to automatically segment learner texts as a step towards providing visual feedback. We train a neural sequence tagging model and use it to segment EFL email texts into functional segments. Our algorithm reaches a token-based accuracy of 90% when trained per prompt and between 83% and 87% in a cross-prompt scenario.

Published: 2023-05-16. Copyright (c) 2023 Yuning Ding, Ruth Trüb, Johanna Fleckenstein, Stefan Keller, Andrea Horbach.

https://ecp.ep.liu.se/index.php/sltc/article/view/684
Speech Technology to Support Phonics Learning for Kindergarten Children at Risk of Dyslexia
Stine Fuglsang Engmose, Peter Juel Henrichsen

We present the AiRO learning environment for kindergarten children at risk of developing dyslexia. The AiRO frontend, easy to use for pupils down to 5 years old, introduces each spelling task with pictorial and auditory cues. AiRO responds to spelling attempts with phonetic renderings (synthetic voice). We introduce the didactic and technical principles behind AiRO before presenting our first experiment with 50 kindergarten pupils. Our subjects were pre- and post-tested on reading and spelling. After four weeks of AiRO-based training, the experimental group significantly outperformed the control group, suggesting that a new CALL-based pedagogical approach to preventing dyslexia for some children may be within reach.

Published: 2023-05-16. Copyright (c) 2023 Stine Fuglsang Engmose, Peter Juel Henrichsen.

https://ecp.ep.liu.se/index.php/sltc/article/view/685
On the relevance and learner dependence of co-text complexity for exercise difficulty
Tanja Heck, Detmar Meurers

Adaptive exercise sequencing in Intelligent Language Tutoring Systems (ILTS) aims to select exercises for individual learners that match their abilities. For exercises practicing forms in isolation, it may be sufficient for sequencing to consider the form being practiced. But when exercises embed the forms in a sentence or a bigger language context, little is known about how the nature of this co-text influences learners in completing the exercises. To fill this gap, based on data from two large field studies conducted with an English ILTS in German secondary schools, we analyze the impact of co-text complexity on learner performance for different exercise types and learners at different proficiency levels. The results show that co-text complexity is an important predictor of a learner's performance on practice exercises, especially for gap-filling and Jumbled Sentences exercises, and particularly for learners at higher proficiency levels.

Published: 2023-05-16. Copyright (c) 2023 Tanja Heck, Detmar Meurers.

https://ecp.ep.liu.se/index.php/sltc/article/view/686
Manual and Automatic Identification of Similar Arguments in EFL Learner Essays
Ahmed Mousa, Ronja Laarmann-Quante, Andrea Horbach

Argument mining typically focuses on identifying argumentative units such as claims, positions and evidence in texts. In an educational setting, e.g. when teachers grade students' essays, they may in addition benefit from information about the content of the arguments being used. We thus present a pilot study on the identification of similar arguments in a set of essays written by English-as-a-foreign-language (EFL) students. In a manual annotation study, we show that human annotators are able to assign sentences to a set of 26 reference arguments with a rather high agreement of κ > .70. In a set of experiments based on (a) unsupervised clustering and (b) supervised machine learning, we find that both approaches perform rather poorly on this task, but can be moderately improved by using a set of six meta classes instead of the more fine-grained argument distinctions.

Published: 2023-05-16. Copyright (c) 2023 Ahmed Mousa, Ronja Laarmann-Quante, Andrea Horbach.
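A minimal sketch of the kind of similarity-based assignment this last task involves, mapping each learner sentence to its closest reference argument; the encoder model, the example reference arguments and the use of plain cosine similarity are illustrative assumptions, not the setup of the paper.

    # Illustrative sketch (not the paper's system): assign each learner sentence
    # to the most similar reference argument via sentence-embedding cosine similarity.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed off-the-shelf English encoder

    reference_arguments = [                              # hypothetical reference arguments
        "School uniforms reduce peer pressure.",
        "School uniforms limit students' self-expression.",
    ]
    sentences = ["Wearing the same clothes makes bullying less likely."]

    ref_emb = model.encode(reference_arguments, convert_to_tensor=True)
    sent_emb = model.encode(sentences, convert_to_tensor=True)
    scores = util.cos_sim(sent_emb, ref_emb)  # shape: (n_sentences, n_reference_arguments)

    for sentence, row in zip(sentences, scores):
        best = int(row.argmax())
        print(sentence, "->", reference_arguments[best], float(row[best]))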
https://ecp.ep.liu.se/index.php/sltc/article/view/687
DaLAJ-GED - a dataset for Grammatical Error Detection tasks on Swedish
Elena Volodina, Yousuf Ali Mohammed, Aleksandrs Berdicevskis, Gerlof Bouma, Joey Öhman

DaLAJ-GED is a dataset for linguistic acceptability judgments for Swedish, covering five head classes: lexical, morphological, syntactical, orthographical and punctuation. DaLAJ-GED is an extension of the DaLAJ.v1 dataset (Volodina et al., 2021a,b). Both DaLAJ datasets are based on the SweLL-gold corpus (Volodina et al., 2019) and its correction annotation categories. DaLAJ-GED, presented here, contains 44,654 sentences, distributed (almost) equally between correct and incorrect ones. It is primarily aimed at the linguistic acceptability judgment task, but can also be used for other tasks related to grammatical error detection (GED) at sentence level. DaLAJ-GED is included in the Swedish SuperLim 2.0 collection, an extension of SuperLim (Adesam et al., 2020), a benchmark for Natural Language Understanding (NLU) tasks for Swedish. This paper gives a concise overview of the dataset and presents a few benchmark results for the task of linguistic acceptability, i.e. binary classification of sentences as either correct or incorrect.

Published: 2023-05-16. Copyright (c) 2023 Elena Volodina, Yousuf Ali Mohammed, Aleksandrs Berdicevskis, Gerlof Bouma, Joey Öhman.

https://ecp.ep.liu.se/index.php/sltc/article/view/688
Automated Assessment of Task Completion in Spontaneous Speech for Finnish and Finland Swedish Language Learners
Ekaterina Voskoboinik, Yaroslav Getman, Ragheb Al-Ghezi, Mikko Kurimo, Tamas Grosz

This study investigates the feasibility of automated content scoring for spontaneous spoken responses from Finnish and Finland Swedish learners. Our experiments reveal that pretrained Transformer-based models outperform the tf-idf baseline in automatic task completion grading. Furthermore, we demonstrate that pre-fine-tuning these models to differentiate between responses to distinct prompts enhances subsequent task completion fine-tuning. We observe that task completion classifiers learn faster and produce predictions with stronger correlations to human grading when accounting for task differences. Additionally, we find that employing similarity learning, as opposed to conventional classification fine-tuning, further improves the results. It is especially helpful to learn not only the similarities between responses in the same score bin, but also the exact differences between the average human scores the responses received. Lastly, we demonstrate that models applied to both manual and ASR transcripts yield comparable correlations to human grading.

Published: 2023-05-16. Copyright (c) 2023 Ekaterina Voskoboinik, Yaroslav Getman, Ragheb Al-Ghezi, Mikko Kurimo, Tamas Grosz.
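For illustration, a minimal sketch of a tf-idf baseline of the general kind the study compares against: transcripts of spoken responses are vectorized and a linear classifier predicts the task-completion grade. The example transcripts and grades below are placeholders, not the study's data.

    # Illustrative tf-idf baseline sketch (placeholder data, not the study's setup).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    transcripts = [
        "I would like to book a table for two at seven",
        "yesterday I went to the shop and bought some milk",
        "uh I do not know what to say about this",
    ]
    grades = [3, 2, 0]  # hypothetical task-completion scores

    baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                             LogisticRegression(max_iter=1000))
    baseline.fit(transcripts, grades)
    print(baseline.predict(["I want to reserve a table for tonight"]))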