Apply deep learning to improve the question analysis model in the Vietnamese question answering system

Dang Thi Phuc; Dang Van Nghiem; Bui Binh Minh; Tran My Linh; Dau Sy Hieu

doi:10.11591/ijece.v13i3.pp3311-3321

Abstract

Question answering (QA) systems are now widely used for automated answering, and analysis of the question's meaning plays an important role, directly affecting the accuracy of the system. In this article, we propose an improvement for question-answering models by adding more specific question analysis steps, including contextual characteristic analysis, part-of-speech (POS) tag analysis, and question-type analysis, built on a deep learning network architecture. The weights of words extracted through the question analysis steps are combined with the best matching 25 (BM25) algorithm to find the most relevant paragraph of text and are incorporated into the QA model to find the best and least noisy answer. The dataset for the question analysis step consists of 19,339 labeled questions covering a variety of topics. The results of the question analysis model are combined to train the question-answering model on a dataset related to the learning regulations of the Industrial University of Ho Chi Minh City. It includes 17,405 question-answer pairs for the training set and 1,600 pairs for the test set, on which the robustly optimized BERT pre-training approach (RoBERTa) model achieves an F1-score of 74%. The model has improved significantly. For long and complex questions, the model has extracted weights and correctly provided answers based on the question’s contents.
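The abstract does not say how the extracted word weights are combined with BM25, so the following is only a minimal sketch under one plausible assumption: each query term's BM25 contribution is multiplied by its weight. The function name `weighted_bm25`, the example weights, the parameters `k1` and `b`, and the whitespace-tokenized English paragraphs are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def weighted_bm25(query_weights, paragraphs, k1=1.5, b=0.75):
    """Score tokenized paragraphs against a weighted query.

    query_weights: dict of query term -> weight (assumed to come from a
                   question analysis step; values here are illustrative).
    paragraphs:    list of token lists.
    Returns one score per paragraph.
    """
    n_docs = len(paragraphs)
    avg_len = sum(len(p) for p in paragraphs) / n_docs

    # Number of paragraphs that contain each term.
    doc_freq = Counter()
    for p in paragraphs:
        doc_freq.update(set(p))

    scores = []
    for p in paragraphs:
        tf = Counter(p)
        score = 0.0
        for term, weight in query_weights.items():
            if term not in tf:
                continue
            df = doc_freq[term]
            # Non-negative IDF variant of BM25.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Term-frequency saturation with length normalization.
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(p) / avg_len)
            )
            # Assumption: the term weight scales the BM25 contribution.
            score += weight * idf * norm
        scores.append(score)
    return scores


paragraphs = [
    "students must register for courses before the semester starts".split(),
    "the library opens at eight every morning".split(),
    "tuition fees are paid before course registration closes".split(),
]
query_weights = {"register": 2.0, "courses": 1.5, "before": 0.5}

scores = weighted_bm25(query_weights, paragraphs)
best = max(range(len(paragraphs)), key=scores.__getitem__)
```

Here `best` is the index of the highest-scoring paragraph, which would then be passed to the reader model as context. A real Vietnamese pipeline would need a proper word segmenter rather than whitespace splitting.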

Keywords

best matching 25; bidirectional encoder representations from transformers; natural language processing; question answering system; deep learning

Copyright (c) 2023 Dang Thi Phuc, Dang Van Nghiem, Bui Binh Minh, Tran My Linh, Dau Sy Hieu

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES).
