Comparative analysis of deep Siamese models for medical reports text similarity
Abstract
Although medical reports have been digitized, they consist largely of free text and remain underused. Extracting information from these reports is challenging because of their high volume and unstructured nature. Whether the extracted information is relevant and of high quality can be assessed by measuring semantic textual similarity (STS). The primary aim of this study is therefore to develop and evaluate four models for determining STS between sentence pairs in medical reports: Siamese Manhattan convolutional neural network (CNN), Siamese Manhattan long short-term memory (LSTM), Siamese Manhattan hybrid CNN-LSTM, and Siamese Manhattan hybrid LSTM-CNN. Their performance was compared against the cosine similarity and word mover's distance (WMD) methods. The results indicate that the Siamese Manhattan hybrid LSTM-CNN model outperforms the other models, with a similarity score of 1 for each sentence pair, signifying identical semantic meaning.
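As a minimal illustration of how a similarity score of 1 can arise, the sketch below assumes the formulation commonly used in Siamese Manhattan networks, where the score is exp(-d) and d is the L1 (Manhattan) distance between the two sentence encodings. The abstract does not state the exact formula, so this is an assumption, and the function names and the toy encoding vectors are hypothetical stand-ins for the outputs of a trained encoder. A plain cosine similarity is included for comparison.

```python
import math


def manhattan_similarity(h1, h2):
    """Assumed Siamese Manhattan score: exp(-L1 distance), in (0, 1]."""
    if len(h1) != len(h2):
        raise ValueError("encodings must have the same length")
    l1_distance = sum(abs(a - b) for a, b in zip(h1, h2))
    return math.exp(-l1_distance)


def cosine_similarity(h1, h2):
    """Cosine of the angle between two encodings, in [-1, 1]."""
    if len(h1) != len(h2):
        raise ValueError("encodings must have the same length")
    dot = sum(a * b for a, b in zip(h1, h2))
    norm1 = math.sqrt(sum(a * a for a in h1))
    norm2 = math.sqrt(sum(b * b for b in h2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm1 * norm2)


# Hypothetical sentence encodings (not taken from the paper).
enc_a = [0.2, -0.5, 0.7]
enc_b = [0.2, -0.5, 0.7]   # identical to enc_a
enc_c = [0.9, 0.1, -0.3]   # a different encoding

print(manhattan_similarity(enc_a, enc_b))  # identical encodings give the maximum score
print(manhattan_similarity(enc_a, enc_c))  # differing encodings give a score below 1
print(cosine_similarity(enc_a, enc_c))
```

Under this assumed formulation, the score reaches 1 only when the two encodings coincide, which is consistent with reading a score of 1 as identical semantic meaning.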
Keywords
Biomedical natural language processing; BioWordVec; Hybrid LSTM-CNN; Medical report; Semantic text similarity; Siamese Manhattan
DOI: http://doi.org/10.11591/ijece.v14i6.pp6969-6980
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).