Protective Measures for Sharing the Finnish DarkWeb Marketplace Corpus (FINDarC)

Authors

  • Krister Lindén Department of Digital Humanities, University of Helsinki, Finland
  • Teemu Ruokolainen Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
  • Lasse Hämäläinen Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
  • J. Tuomas Harviainen Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
  • Martin Matthiesen CSC - IT Center for Science, Espoo, Finland
  • Mietta Lennes Department of Digital Humanities, University of Helsinki, Finland

DOI:

https://doi.org/10.3384/ecp210013

Abstract

We discuss the archiving procedure for a corpus comprising posts submitted to Torilauta, a Finnish dark web marketplace website. The site was active from 2017 to 2021 and was, during this time, one of the most prominent online markets for illegal narcotics in Finland. A reduced version of the corpus, the Finnish Dark Web Marketplace Corpus (FINDarC), has been archived in the Language Bank of Finland. In the current work, we focus on the protective measures for storing the data and on how researchers can apply for access rights to the corpus under the CLARIN RES licence.

Published

2024-07-09