Comparative analysis of deep Siamese models for medical reports text similarity

Dian Kurniasari; Mustofa Usman; Warsono Warsono; Favorisen Rosyking Lumbanraja

doi:10.11591/ijece.v14i6.pp6969-6980

Abstract

Although medical reports have been digitized, they consist largely of text data and remain underused. Extracting information from these reports is challenging because of their high volume and unstructured nature. One way to extract relevant, high-quality information is to measure semantic textual similarity (STS). This study therefore develops and evaluates four models for determining STS between sentence pairs in medical reports: Siamese Manhattan convolutional neural network (CNN), Siamese Manhattan long short-term memory (LSTM), Siamese Manhattan hybrid CNN-LSTM, and Siamese Manhattan hybrid LSTM-CNN. Performance was compared using cosine similarity and word mover's distance (WMD). The results indicate that the Siamese Manhattan hybrid LSTM-CNN model outperforms the other models, with a similarity score of 1 for each sentence pair, signifying identical semantic meaning.
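To make the similarity measures named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the formulation commonly used in Siamese Manhattan networks, where the score is exp(-L1 distance) between the two encoded sentence vectors; the abstract does not state the exact formula. The function names and the toy vectors are hypothetical, and WMD is omitted because it requires word embeddings and an optimal-transport solver.

```python
import math


def manhattan_similarity(h1, h2):
    """Assumed Siamese Manhattan score: exp(-L1 distance), in (0, 1]."""
    if len(h1) != len(h2):
        raise ValueError("vectors must have the same length")
    l1_distance = sum(abs(a - b) for a, b in zip(h1, h2))
    return math.exp(-l1_distance)


def cosine_similarity(h1, h2):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    if len(h1) != len(h2):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(h1, h2))
    norm1 = math.sqrt(sum(a * a for a in h1))
    norm2 = math.sqrt(sum(b * b for b in h2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)


# Hypothetical encoder outputs for two sentences (toy values, not real data).
sentence_a = [0.2, 0.5, 0.1]
sentence_b = [0.2, 0.5, 0.1]

print(manhattan_similarity(sentence_a, sentence_b))
print(cosine_similarity(sentence_a, sentence_b))
```

Under this assumed formulation, a score of 1 arises only when the two encoded vectors coincide, which is consistent with the abstract reading a score of 1 as identical semantic meaning. Cosine similarity, by contrast, ignores vector magnitude, so two vectors pointing in the same direction score close to 1 even when their L1 distance is large.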

Keywords

Biomedical natural language processing; BioWordVec; Hybrid LSTM-CNN; Medical report; Semantic text similarity; Siamese Manhattan

DOI: http://doi.org/10.11591/ijece.v14i6.pp6969-6980

Copyright (c) 2024 Dian Kurniasari, Mustofa Usman, Warsono Warsono, Favorisen Rosyking Lumbanraja

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES).
