Automated Assessment of Task Completion in Spontaneous Speech for Finnish and Finland Swedish Language Learners


  • Ekaterina Voskoboinik
  • Yaroslav Getman
  • Ragheb Al-Ghezi
  • Mikko Kurimo
  • Tamas Grosz



automatic speech assessment, L2 speech evaluation, content scoring


This study investigates the feasibility of automated content scoring for spontaneous spoken responses from learners of Finnish and Finland Swedish. Our experiments show that pretrained Transformer-based models outperform a TF-IDF baseline in automatic task completion grading. Furthermore, we demonstrate that pre-fine-tuning these models to differentiate between responses to distinct prompts improves subsequent task completion fine-tuning. When task differences are accounted for, task completion classifiers learn faster and produce predictions that correlate more strongly with human grading. Additionally, we find that similarity learning, as opposed to conventional classification fine-tuning, further improves the results. It is especially helpful to learn not just the similarities between responses within one score bin, but also the exact differences between the average human scores that the responses received. Lastly, we demonstrate that the models yield comparable correlations to human grading whether they are applied to manual or ASR transcripts.
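To make the last similarity-learning point concrete, the sketch below shows one possible way to turn differences between average human scores into pairwise similarity targets. It is an illustration under stated assumptions, not necessarily the formulation used in this work: the function names `target_similarity` and `build_pairs`, the linear mapping, and the 0 to 3 score range are all hypothetical.

```python
from itertools import combinations


def target_similarity(score_a, score_b, min_score=0.0, max_score=3.0):
    """Map the gap between two average human scores to a similarity in [0, 1].

    Assumption: a linear mapping over a hypothetical 0-3 score scale.
    Identical scores give 1.0; scores at opposite ends of the scale give 0.0.
    """
    span = max_score - min_score
    if span <= 0:
        raise ValueError("max_score must be greater than min_score")
    return 1.0 - abs(score_a - score_b) / span


def build_pairs(responses, min_score=0.0, max_score=3.0):
    """Build (text_a, text_b, target) triples from (text, average_score) items.

    Every unordered pair of responses gets a graded target, so a model trained
    on these triples sees how far apart two scores are, not only whether the
    two responses fall into the same score bin.
    """
    return [
        (text_a, text_b, target_similarity(s_a, s_b, min_score, max_score))
        for (text_a, s_a), (text_b, s_b) in combinations(responses, 2)
    ]
```

Triples of this kind could then serve as regression targets for the similarity of two response embeddings, in contrast to a binary target that only marks whether two responses share a score bin.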