Evaluating the Generalisation of an Artificial Learner
DOI: https://doi.org/10.3384/ecp211015

Keywords: LLM, Learner Simulation, NLP

Abstract
This paper focuses on the creation of LLM-based artificial learners. Motivated by the capability of language models to encode language representations, we evaluate such models on predicting masked tokens in learner corpora. We pre-trained two learner models, one on a training set of the EFCAMDAT (the natural learner model) and another on the C4200m dataset (the synthetic learner model), and evaluated them against a native model using an external English for Specific Purposes corpus of French undergraduates (CELVA) as the test set. We measured metrics related to accuracy, consistency, and divergence. While the native model performs reasonably well, the natural learner pre-trained model shows improvements in token recall at k. We complement the accuracy metric by showing that the native language model makes "over-confident" mistakes, whereas our artificial learners make mistakes where the probabilities are closer to uniform. Finally, we show that the token choices of the native model diverge from those of the natural learner model, and that this divergence is higher at lower proficiency levels.
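To make the evaluation setup concrete, the following is a minimal sketch (not the authors' released code) of masked-token recall at k with a Hugging Face masked language model. The checkpoint name and the helper function are assumptions for illustration only; the paper pre-trains its own native and learner models rather than using an off-the-shelf checkpoint.

```python
# Sketch: mask a target token in a sentence and check whether the model
# ranks it among its top-k predictions (recall at k).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Hypothetical checkpoint; the paper uses its own pre-trained models.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

def recall_at_k(sentence: str, target: str, k: int = 5) -> bool:
    """Replace the first occurrence of `target` with the mask token and
    test whether `target` appears in the model's top-k predictions."""
    masked = sentence.replace(target, tokenizer.mask_token, 1)
    inputs = tokenizer(masked, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the masked position and take the k highest-scoring tokens.
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    top_k = logits[0, mask_pos].topk(k).indices.squeeze(0)
    target_id = tokenizer.convert_tokens_to_ids(target)
    return target_id in top_k

print(recall_at_k("The cat sat on the mat .", "mat", k=5))
```

Aggregating this boolean over all masked positions in a test corpus such as CELVA yields the recall-at-k figures the abstract refers to; comparing the full predicted distributions (rather than only the top-k sets) is what the consistency and divergence metrics build on.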