MWE-Finder: An evaluation through three case studies

Authors

  • Martin Kroon, Institute for Language Sciences, University of Utrecht, the Netherlands
  • Jan Odijk, Institute for Language Sciences, University of Utrecht, the Netherlands

DOI:

https://doi.org/10.3384/ecp210009

Abstract

In this paper we showcase and evaluate MWE-Finder, a system that allows users to search for occurrences of a multiword expression (MWE) in a large Dutch text corpus. To this end, we conduct three small case studies and discuss the results in detail. The case studies use the MWEs 0geen *+haan zal naar iets kraaien ‘no one will say anything about something’, iemand zal 0dat *+varken wassen ‘someone will deal with that problem’ and iemand zal iemand het hemd van het lijf vragen ‘someone will want to know all the ins and outs of something from someone’, all given in canonical form following Odijk (2023) and Odijk and Kroon (2024).

The results show that MWE-Finder retrieves the target MWEs very accurately, reaching an accuracy of 93.7% and an F1-score of 95.2%. The case studies also reveal points where MWE-Finder can be improved, specifically concerning the enrichment of syntactic parses by making the object relation explicit in certain constructions.
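For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and F1-score are conventionally computed from retrieval counts. It assumes the standard confusion-matrix definitions; the paper's exact evaluation setup may differ. The function names and the counts in the usage lines are hypothetical and are not taken from the case studies.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all judged sentences that were classified correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in count form."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical counts for illustration only (NOT the paper's data):
# tp = MWE occurrences correctly retrieved, fp = non-MWE hits retrieved,
# fn = MWE occurrences missed, tn = non-MWE sentences correctly rejected.
example_accuracy = accuracy(tp=8, fp=1, fn=1, tn=2)
example_f1 = f1_score(tp=8, fp=1, fn=1)
print(f"accuracy={example_accuracy:.3f}, F1={example_f1:.3f}")
```

Under these definitions F1 ignores true negatives, so it can exceed accuracy when the evaluated sample contains few correctly rejected sentences.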

Published

2024-07-09