PoetryLab as Infrastructure for the Analysis of Spanish Poetry


  • Javier de la Rosa
  • Álvaro Pérez
  • Laura Hernández
  • Aitor Díaz
  • Salvador Ros
  • Elena González-Blanco




poetry, ontologies, linked open data, natural language processing


The development of the network of ontologies of the ERC POSTDATA Project brought to light deficiencies in the completeness of the currently available European poetry corpora. To tackle the issue for the Spanish poetic tradition, we designed a set of tools that any scholar can use to automatically enrich the analysis of Spanish poetry. The effort crystallized in PoetryLab, an extensible open-source toolkit for syllabification, scansion, enjambment detection, rhyme detection, stanza identification, and historical named entity recognition for Spanish poetry. We designed the system to be interoperable, compliant with the project ontologies, easy to use for both tech-savvy and non-expert researchers, and to require minimal maintenance and setup. Furthermore, we propose integrating PoetryLab as a core functionality for Spanish poetry in the tool catalog of CLARIN.