Sailing through multiword expression identification with Wiktionary and Linguse: A case study of language learning

Authors

  • Till Überrück-Fries Université Paris-Saclay, CNRS, LISN
  • Agata Savary Université Paris-Saclay, CNRS, LISN
  • Agnieszka Dryjańska University of Warsaw, Institute of Romance Studies

DOI:

https://doi.org/10.3384/ecp211019

Keywords:

MWE identification, language learning, Wiktionary

Abstract

Multiword expressions (MWEs), due to their idiomatic nature, pose particular challenges in comprehension tasks and vocabulary acquisition for language learners. Current NLP tools fall short of comprehensively aiding language learners when they encounter MWEs. While proficient in identifying MWEs seen during training, current systems are constrained by limited training data. To address the specific needs of language learners, this research integrates expansive MWE lexicons with NLP methods, as advocated by Savary et al. (2019a). Outcomes include a specialized MWE corpus derived from Wiktionary, the enhancement of Linguse, a reading application for language learners, with MWE annotations, and empirical validation with students of French. The result is an MWE identifier designed for the requirements of language learners.

Published

2024-10-15