Potential of ASR for the study of L2 learner corpora

Authors

  • Sarra El Ayari, Structures Formelles du Langage, CNRS & Paris 8 University
  • Zhongjie Li, Structures Formelles du Langage, CNRS & Paris 8 University

DOI:

https://doi.org/10.3384/ecp211004

Keywords:

ASR, learner corpora, second language acquisition, WER

Abstract

This study is at the crossroads of Natural Language Processing (NLP) and Second Language Acquisition (SLA). We used Whisper's speech recognition to obtain automatic transcripts of a French L2 learner corpus and compared them with pre-existing manual transcripts using Word Error Rate (WER) measurements. We then conducted quantitative and qualitative analyses of the issues that the specificities of interlanguage raise for any automatic tool. We discuss the issues encountered by Whisper that are specific to learner corpora.
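
For readers unfamiliar with the metric, WER is conventionally defined as (S + D + I) / N, where S, D and I are the numbers of substituted, deleted and inserted words needed to turn the reference transcript into the automatic one, and N is the number of words in the reference. The sketch below shows one way to compute it in Python with a word-level edit distance. It is an illustration only, not the authors' pipeline: the function name, the normalization (lowercasing and whitespace tokenization) and the example sentences are assumptions made here and are not taken from the corpus.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: both transcripts are lowercased and split on whitespace.
    The study may have applied a different normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Invented example: a manual transcript versus a hypothetical ASR output.
manual = "je suis allé à la plage"
automatic = "je suis aller à la page"
print(f"WER = {word_error_rate(manual, automatic):.3f}")
```

In this invented pair, two of the six reference words are substituted, so the WER is 2/6. Because WER counts every mismatch equally, a score alone does not distinguish a recognition error from a learner form that the system silently "corrected", which is why a qualitative analysis is needed alongside the measurement.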

Published

2024-10-15