Analyses of information security standards on data crawled from company web sites using SweClarin resources


  • Arne Jönsson, Computer and Information Science, Linköping University, Linköping, Sweden
  • Subhomoy Bandyopadhyay, Management and Engineering, Linköping University, Linköping, Sweden
  • Svjetlana Pantic Dragisic, Management and Engineering, Linköping University, Linköping, Sweden
  • Andrea Fried, Management and Engineering, Linköping University, Linköping, Sweden



To analyse Swedish companies’ adherence to and adoption of the information security standard ISO 27001, and to examine the communicative constitution of preventive innovation in organisations, we have created a corpus of corporate texts from Swedish company websites. The corpus was analysed from multiple interdisciplinary perspectives, in close cooperation between management researchers and SweClarin researchers, using SweClarin tools and resources as well as standard language technology tools. Some analyses require deep reading, which was performed by management researchers, often guided by results from the language analyses. Initial results have been presented at a management studies conference. In this paper, we present the research issues, the methods used in the project, the results, and the experience of SweClarin researchers supporting researchers in the social sciences. Our contribution is to show how integrating human insights with digital methods can increase the credibility and validity of a digitally acquired data set and of the subsequent research findings. In our view, a combination of human deep reading (management researchers), contextual lexical verification (management studies) and language technology (content and sentiment analysis) can help to sensitise computational text analysis for medium-sized data sets.