A New Semantic Similarity Scheme for more Accurate Identification in Medical Data

Colin Wilcox, Soufiene Djahel, Vasilios Giagos, Kristopher Welsh, Nicholas Costen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper proposes a new measure of similarity between items of personal textual information retrieved from historic medical records, with the aim of correcting errors introduced by poor encoding and data omission. The key motivation underlying the proposed layered algorithm, named Semantic Similarity scheme (SSIM), is to create a consistent, complete and accurate data set that can serve as a basis for the identification and authentication of individuals in a medical context. Such consistent data could then form part of an access control system without compromising medical ethics or security. Evaluation on four sample data sets from the UK, USA, Canada and Australia shows promising benefits compared to other similarity measures, including the Jaccard index, Sørensen-Dice and cosine similarity, especially when nicknames, abbreviations and synonyms are used to determine similarity.
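For orientation, the three baseline measures named in the abstract can be sketched as follows. This is not the SSIM algorithm, which the abstract does not specify; it is a minimal illustration that assumes names are compared as character bigrams (a tokenisation choice made here, not taken from the paper). The function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def bigrams(text):
    """Count the character bigrams of a lower-cased string (assumed tokenisation)."""
    s = text.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def jaccard(a, b):
    """Jaccard index: |X ∩ Y| / |X ∪ Y| over the sets of distinct bigrams."""
    x, y = set(bigrams(a)), set(bigrams(b))
    union = x | y
    return len(x & y) / len(union) if union else 1.0


def sorensen_dice(a, b):
    """Sørensen-Dice coefficient: 2|X ∩ Y| / (|X| + |Y|) over distinct bigrams."""
    x, y = set(bigrams(a)), set(bigrams(b))
    total = len(x) + len(y)
    return 2 * len(x & y) / total if total else 1.0


def cosine(a, b):
    """Cosine similarity between the bigram count vectors of the two strings."""
    x, y = bigrams(a), bigrams(b)
    dot = sum(x[g] * y[g] for g in x)
    norm = sqrt(sum(v * v for v in x.values())) * sqrt(sum(v * v for v in y.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Compare a partial-overlap pair with a nickname pair sharing no bigrams.
    for left, right in [("William", "Bill"), ("Margaret", "Peggy")]:
        print(left, right,
              round(jaccard(left, right), 3),
              round(sorensen_dice(left, right), 3),
              round(cosine(left, right), 3))
```

Under this bigram assumption, a nickname pair such as "Margaret" and "Peggy" shares no bigrams, so all three measures return zero even though both names can refer to the same person. That is the kind of case the abstract highlights, where nicknames, abbreviations and synonyms matter.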
Original language: English
Title of host publication: 2023 IEEE International Smart Cities Conference
Subtitle of host publication: (ISC2 2023)
Publisher: IEEE
Publication status: Accepted/In press - 9 Jul 2023
Event: 9th IEEE International Smart Cities Conference - University Politehnica of Bucharest, Bucharest, Romania
Duration: 24 Sep 2023 to 27 Sep 2023
Conference number: 9
https://attend.ieee.org/isc2-2023/

Conference

Conference: 9th IEEE International Smart Cities Conference
Abbreviated title: ISC2 2023
Country/Territory: Romania
City: Bucharest
Period: 24/09/23 to 27/09/23