Implant Term Extraction from Swedish Medical Records – Phase 1: Lessons Learned


  • Oskar Jerdhaf
  • Marina Santini
  • Peter Lundberg
  • Anette Karlsson
  • Arne Jönsson



Keywords: terminology extraction, BERT, implant terms, term clusters, focused terminology extraction, KDTree, BallTree


We present the case of automatic identification of “implant terms”. Implant terms are specialized terms that matter to domain experts (e.g. radiologists) but are difficult to retrieve automatically because they occur sparsely. The need for automatic identification of implant terms stems from patient safety: patients who have an implant may be at risk if they undergo Magnetic Resonance Imaging (MRI). At present, the workflow for verifying whether a patient could be at risk of MRI side effects is manual and laborious. We claim that this workflow can be made faster, more streamlined and safer by automatically sifting through patients’ medical records to ascertain whether they have or have had an implant. To this end, we use BERT, a state-of-the-art deep learning approach based on pre-trained word embeddings, to create a model that outputs term clusters. We then assess the linguistic quality, or term relatedness, of individual term clusters using a simple intra-cluster metric that we call cleanliness. Results are promising.
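As a rough illustration of the pipeline summarized above, the sketch below builds a term cluster around a seed term and scores it. Everything in it is an assumption made for illustration: the toy 2-D vectors stand in for BERT embeddings, the brute-force Euclidean search stands in for a KDTree or BallTree lookup, and cleanliness is assumed to be the share of cluster terms judged relevant, which may differ from the definition used in the paper. The names `EMBEDDINGS`, `nearest_terms` and `cleanliness` are hypothetical.

```python
import math

# Toy 2-D vectors standing in for BERT embeddings (made-up values, not real data).
EMBEDDINGS = {
    "pacemaker": (0.9, 0.1),
    "stent": (0.8, 0.2),
    "protes": (0.85, 0.15),    # Swedish: prosthesis
    "huvudvärk": (0.1, 0.9),   # Swedish: headache
    "feber": (0.2, 0.8),       # Swedish: fever
}


def nearest_terms(seed, k, embeddings=EMBEDDINGS):
    """Return the k terms closest to `seed` by Euclidean distance.

    Brute force for clarity; a KDTree or BallTree would replace this
    linear scan on a realistic vocabulary.
    """
    origin = embeddings[seed]
    others = [term for term in embeddings if term != seed]
    others.sort(key=lambda term: math.dist(origin, embeddings[term]))
    return others[:k]


def cleanliness(cluster, relevant_terms):
    """Assumed definition: fraction of cluster terms judged relevant."""
    if not cluster:
        return 0.0
    return sum(term in relevant_terms for term in cluster) / len(cluster)


# Cluster around one seed term, then score it against a hand-made relevance set.
cluster = nearest_terms("pacemaker", k=3)
score = cleanliness(cluster, relevant_terms={"stent", "protes"})
```

Under this assumed definition, a cluster whose neighbours are all implant-related would score 1.0, and a cluster polluted by unrelated terms would score lower.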