Out-of-the-Box Graded Vocabulary Lists with Generative Language Models: Fact or Fiction?

Authors

  • David Alfter, Gothenburg Research Infrastructure in Digital Humanities (GRIDH), University of Gothenburg, Sweden

DOI:

https://doi.org/10.3384/ecp211001

Keywords:

large language models, graded vocabulary

Abstract

In this paper, we explore the zero-shot classification potential of generative language models for the task of grading vocabulary and generating graded vocabulary lists. We expand upon prior research by testing five different language model families on five different languages. Our results indicate that generative models can grade vocabulary across different languages with moderate but stable success. Producing vocabulary in a language other than English, however, seems problematic and often leads to the generation of non-words or of words in a language other than the target language.

Published

2024-10-15