Choosing the Right Tool for You
Informed Evaluation of Text Analysis Tools
DOI:
https://doi.org/10.3384/ecp216.10
Keywords:
Named Entity Recognition, Historical Corpora, Evaluation
Abstract
Natural Language Processing (NLP) research showcases many promising tools and methods for text analysis. Scholars from diverse fields who want to use NLP in their research are confronted with a wide range of ready-to-use models that claim excellent performance on standard benchmarks. Consequently, choosing an appropriate tool has become a task in its own right. Our goal is to exemplify a methodology that encourages critical evaluation and detailed analysis of the automatic outputs of NLP tools. In particular, we analyze the case of choosing the best Named Entity Recognition (NER) tool for a corpus of Dutch biographies. Our use case shows how to make informed decisions by examining different aspects of a custom dataset at both the instance and aggregated levels, which in turn improves the outcomes for the original research question.
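As a rough illustration of what evaluation at the instance and aggregated levels can look like, the following Python sketch compares the entity spans predicted by a tool against a gold annotation. It first lists the individual errors per document and then pools the counts into corpus-level precision, recall, and F1. The span representation `(start, end, label)`, the function names, and the toy data are assumptions made for illustration and are not taken from the paper. Exact-match scoring is only one of several possible matching criteria.

```python
from collections import Counter


def evaluate_document(gold, predicted):
    """Instance level: compare (start, end, label) spans for one document.

    A prediction counts as correct only if both boundaries and label match
    exactly (an assumed criterion; partial matching is another option).
    """
    gold, predicted = set(gold), set(predicted)
    return {
        "tp": sorted(gold & predicted),   # correctly recognized entities
        "fp": sorted(predicted - gold),   # spurious or mislabeled predictions
        "fn": sorted(gold - predicted),   # gold entities the tool missed
    }


def aggregate(per_document):
    """Aggregated level: pool counts over documents (micro-average)."""
    counts = Counter()
    for result in per_document:
        for key, spans in result.items():
            counts[key] += len(spans)
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: two documents, spans given as token offsets.
gold_docs = [
    [(0, 2, "PER"), (5, 6, "LOC")],
    [(3, 4, "PER")],
]
predicted_docs = [
    [(0, 2, "PER"), (5, 6, "ORG")],   # right span, wrong label
    [(3, 4, "PER"), (7, 8, "LOC")],   # one spurious entity
]

per_doc = [evaluate_document(g, p) for g, p in zip(gold_docs, predicted_docs)]
scores = aggregate(per_doc)

for index, result in enumerate(per_doc):
    print(f"document {index}: fp={result['fp']} fn={result['fn']}")
print(scores)
```

The per-document error lists show what kind of mistakes a tool makes, for example a correct span with the wrong label. The pooled scores summarize how often it makes them. Looking at both views for each candidate tool is one way to compare tools on a custom corpus rather than relying on benchmark figures alone.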