Convolutional recurrent neural network with template based representation for complex question answering

Received Aug 5, 2018; Revised Nov 24, 2019; Accepted Dec 8, 2019

A complex question answering system is developed to answer different types of questions accurately. First, the natural language question is transformed into an internal representation that captures the semantics and intent of the question. In the proposed work, the internal representation is built from templates instead of synonyms or keywords. Each internal representation is then mapped to a relevant query against the knowledge base. In the present work, the Template representation based Convolutional Recurrent Neural Network (T-CRNN) is proposed for answer selection in the Complex Question Answering (CQA) framework. A recurrent neural network is used to capture the correlation between answers and questions and the semantic matching among the collection of answers. First, learning is carried out by a Convolutional Neural Network (CNN), which represents the questions and the answers separately. Then a fixed-length representation is produced for each question-answer pair with the help of a fully connected neural network. To model the semantic matching between the answers, the representation of each Question Answer (QA) pair is fed into the Recurrent Neural Network (RNN). Finally, for the given question, the correctly correlated answers are identified with a softmax classifier.


INTRODUCTION
In recent years, question answering has come to play a major role in information retrieval and human-computer interaction. The required information is described in the form of questions or statements [1]. Question answering systems present an interface where users state their information need in natural language and the search engine produces suitable answers to these questions. Compared with a traditional information retrieval system, only the relevant piece of information is returned as an answer instead of the entire document [2]. Data mining is a subfield of computer science that enables intelligent extraction of useful information [3]. The user expects a correct, comprehensible and concise answer, which may be a sentence, image, word, paragraph, audio fragment or the whole document [4]. The difficulty in this approach lies in converting the user's information need into a form that can be evaluated. This can be accomplished by inferencing and Semantic Web query processing approaches. The application takes advantage of the underlying class structure to determine semantic similarity; for example, street and city are semantically closer than street and time [5].
A question is considered as the combination of a question topic and a question focus. The topic of a question commonly carries the conditions or context based on the user's interests [6]. The interest of the answer is identified from the search topics that are entered. Several aspects of the question topic are formulated in the question focus [7]. If the user modifies the topic in Community Question Answering (CQA), this appears as a topic variation between question and answer. CQA is a type of information retrieval in which the user's need is posed to the community as a natural language question and the community's response is given as a natural language answer. This structure is highly complex, and its properties are sufficient for constructing a QA framework in the presence of structured data sources and a domain ontology [8]. Because components can be replaced and extended with various implementations, portability across target domains or ontologies is obtained [9]. To tackle this issue, a wide range of techniques has been developed for answering natural language questions over large, heterogeneous structured data [10,11]. Initially, information seekers post specific questions on any topic, and other users are allowed to share their answers. By leveraging the community, they are able to obtain correct answers with the help of search engines [12]. Better-quality results are generated by the human intelligence system than by the CQA and automated QA systems. An enormous number of QA pairs has accumulated in data storage, which supports the search and preservation of evaluated questions [13].
Question answering is the process of automatically finding answers to questions posed in natural language. A QA framework requires the capability of understanding natural language linguistics, text and common knowledge [14]. Given a natural language question, the QA system determines the correct answer from a given set of documents [15]. Natural Language Processing (NLP) approaches are commonly employed in QA systems for replying to the user's question. In addition, the same kinds of issues are termed differently in several QA systems. Natural language overheads such as ambiguity, co-reference, implicitness and inference have created hindrances in sentiment analysis too [16]. Natural language search, on the other hand, attempts to use natural language processing to understand the nature of the question and then to search and return a subset of the repositories, such as web databases, that contains the answer to the question. The results are more credible and of higher relevance than results from a keyword search engine [17].
Recently, efficient question answering websites have been developed to determine answers automatically [18]. This kind of question answering system faces some issues during processing [19]. Some questions remain waiting for resolution, some are discarded by the answering users, and a large amount of time is wasted in getting an answer. In the worst case, the asker does not obtain a suitable answer from the experts [20]. Knowledge-sharing question answering websites are affected by these problems. Hence it is necessary to identify the appropriate experts for each question, which improves answer quality, minimizes waiting time and improves the efficiency of knowledge sharing [21].
However, a human can ask a question in a variety of ways. To address this, several techniques have been developed for natural language question answering over large, structured and heterogeneous data. Previous approaches have natural limits due to their representations: rule-based approaches can process only a small collection of "canned" questions, while keyword-based or synonym-based methods cannot fully understand the questions. Hence a template-based approach and better answer selection are required for an efficient question answering system. In this work, two parallel Convolutional Neural Networks (CNNs) are utilized to learn the semantic correlation pattern of a question-answer pair. The proposed approach effectively captures the valuable context from the sequence of answers.
The organization of our work in this paper is described as follows. Section 2 describes the proposed approach of Complex Question Answering. Section 3 explains the research method, template based convolutional recurrent neural network (T-CRNN) for complex question answering. In Section 4, the experimental results of T-CRNN are analyzed. Section 5 illustrates the significant aspects of the work and concludes.

THE PROPOSED METHOD
This section describes the proposed approach to complex question answering. The complex question answering system considers binary factoid questions (BFQs), each of which asks about a specific property of an entity. First, the entity in the input question is replaced using the templates. Based on the replaced entities, a divide and conquer approach is used to decompose the question. Then the CNN, the RNN and the scoring step are used to determine the correct answer. The proposed CQA approach is shown in Figure 1.

Question representation
A question must be represented before it can be answered. Question representation is the process of converting the question from natural language into an internal representation that captures its intent and semantics. Matching is an important concept in semantic web areas such as ontology interoperability, data sharing and document classification [22]. The knowledge base is utilized for question representation. The internal representation is denoted as a template $t$, where $t = (q, e, c)$ for a question $q$ containing an entity $e$, and $c$ represents the category. Figure 2 shows the internal representation of a question. Each question is considered as a set of entities, and the templates from the knowledge base are used to represent the question.
The entity of the question is replaced by its concept through the process of conceptualization. This approach relies on a large semantic network (Probase [23]) that contains millions of concepts, so the granularity is sufficient for representing all kinds of questions. Each predicate in the knowledge base is associated with several templates. In our work, 27,126,355 templates are learned for 2782 predicates. This large number of templates indicates wide coverage for template-based CQA.
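The conceptualization step can be illustrated with a minimal sketch. The toy `CONCEPTS` dictionary is a hypothetical stand-in for Probase, and `to_template` is an illustrative helper rather than the implementation used in this work; it assumes the entity has already been detected in the question and has a single concept.

```python
# Toy entity-to-concept map; a hypothetical stand-in for a large
# semantic network such as Probase.
CONCEPTS = {
    "Honolulu": "city",
    "Barack Obama": "person",
}


def to_template(question: str, entity: str) -> str:
    """Replace an entity mention in the question with its concept.

    Assumes the entity was already identified and maps to exactly one
    concept; real conceptualization depends on the question context.
    """
    concept = CONCEPTS[entity]
    return question.replace(entity, f"${concept}")


print(to_template("How many people live in Honolulu?", "Honolulu"))
# How many people live in $city?
```

Questions that differ only in the entity collapse to the same template, which is what allows one learned template to cover many concrete questions.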

Complex question decomposition
For complex question decomposition, a divide and conquer approach is used in the proposed CQA system. First, the input question is decomposed into a set of BFQs, and each is then answered separately. The block diagram for complex question decomposition is shown in Figure 3. Every decomposed question except the first contains a modified entity in the form of a variable. The question sequence is materialized only after each variable is assigned a specific entity, which is the answer to the previous question.
The answer to a given question is obtained from the RDF knowledge base, which is structured as triples $(s, p, o)$, where $s$ denotes the subject, $p$ represents the predicate and $o$ represents the object. The answer obtained for a question is represented in the form $(q_i, a_i)$, where $q_i$ represents the question and $a_i$ represents the answer to question $q_i$. The answer $a_1$ contains the exact factoid answer of various sentences.
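The chaining of decomposed questions over a triple store can be sketched as follows. This is a simplified illustration under stated assumptions: the `TRIPLES` list, its facts and values, and the helpers `lookup` and `answer_sequence` are all hypothetical, and each BFQ is assumed to have already been mapped to a single predicate.

```python
# Toy RDF-style knowledge base of (subject, predicate, object) triples.
# All facts and values here are illustrative, not real data.
TRIPLES = [
    ("Barack Obama", "birthplace", "Honolulu"),
    ("Honolulu", "population", "350000"),
]


def lookup(subject, predicate):
    """Return the object of the first triple matching (subject, predicate)."""
    for s, p, o in TRIPLES:
        if s == subject and p == predicate:
            return o
    return None


def answer_sequence(first_entity, predicates):
    """Answer a chain of binary factoid questions.

    The answer to each question becomes the entity of the next one,
    mirroring how a decomposed question sequence is materialized.
    """
    entity = first_entity
    answers = []
    for predicate in predicates:
        entity = lookup(entity, predicate)
        if entity is None:
            break  # the chain cannot be continued
        answers.append(entity)
    return answers


# "How many people live in the city where Barack Obama was born?"
print(answer_sequence("Barack Obama", ["birthplace", "population"]))
# ['Honolulu', '350000']
```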

RESEARCH METHOD
This section describes the research method, template based convolutional recurrent neural network (T-CRNN) for complex question answering.

Semantic matching pattern
In order to obtain the semantic matching pattern between the question and the answer, a CNN is used in the proposed T-CRNN. The block diagram of the CNN-based sentence model is shown in Figure 4. The CNN learns the representation of the QA pair $(q, a)$. First, the distributional representations $(r_q, r_a)$ of the question and the answer are learned separately. Then a fixed-length representation is extracted using a fully connected hidden layer. The pre-trained word embedding matrix $W \in \mathbb{R}^{n \times |V|}$ provides a distributional vector for each word of the sentence $s = (w_1, \ldots, w_i, \ldots, w_{|s|})$, where $|V|$ denotes the vocabulary size and $n$ denotes the length of the word vector. The input sentence matrix $Y \in \mathbb{R}^{n \times |s|}$ is constructed and given as input to the convolutional sentence model. The convolutional model contains several convolutional and pooling layers, which provide a high-level representation of the sentence vector. A Convolutional Neural Network (CNN) is a special kind of deep neural network [24].

a. Convolution
From the input sentence matrix $Y$, the feature maps are computed with the parameters $(e^{(v,j)}, d^{(v,j)})$ of convolutional filter $j$ in convolutional layer $v$. The output of convolutional filter $j$ is the feature map $z^{(v,j)}$. In the output of layer $v$, the feature maps take the form of 2-dimensional arrays. Sliding windows of words with width $k_1$ are used for feature mapping in the first convolutional layer. The output of the convolutional layer is computed as

$$z_i^{(v,j)} = \sigma\left(e^{(v,j)} \hat{z}_i^{(v-1)} + d^{(v,j)}\right)$$

where $\hat{z}_i^{(v-1)}$ denotes the units of layer $v-1$ inside the $i$-th sliding window (the words of $Y$ for the first layer) and $\sigma(\cdot)$ is the activation function.

b. Pooling
Max pooling is applied to the output of the convolutional layer over every two-unit window of the feature map. It can be formulated as

$$z_i^{(v,j)} = \max\left(z_{2i-1}^{(v-1,j)}, z_{2i}^{(v-1,j)}\right)$$

The pooling operation filters out useless word compositions and captures meaningful semantic or syntactic structures.

c. Matching
Two convolutional sentence models $(\mathrm{CNN}_q, \mathrm{CNN}_a)$ are derived for the given input QA pair $(q, a)$ to obtain the distributional sentence representations $(r_q, r_a)$:

$$r_q = \mathrm{CNN}_q(q; \theta_q), \qquad r_a = \mathrm{CNN}_a(a; \theta_a)$$

Here, $\theta_q$ and $\theta_a$ denote the parameters of the convolutional sentence models. The sentence vectors $(r_q, r_a)$ and the feature vector $u$ are integrated into a concatenated vector $x = [r_q, r_a, u]$ using a joint layer. A fully connected hidden layer takes the concatenated vector and learns the representation of the QA pair:

$$p = \sigma(W_h x + b_h)$$

where $\sigma(\cdot)$ is the activation function and $(W_h, b_h)$ denotes the parameters of the hidden layer. The output $p$ is the learnt representation of the input QA pair $(q, a)$.
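The convolution and pooling steps of the sentence model can be sketched in plain Python. This is a minimal illustration and not the trained model: it computes a single feature map, assumes `tanh` as the activation, and uses hand-picked toy weights; the function names `conv_feature_map` and `max_pool` are hypothetical.

```python
import math


def conv_feature_map(sentence, weights, bias):
    """One convolutional feature map over sliding windows of words.

    sentence: list of word vectors (each a list of floats).
    weights:  list of k weight vectors, one per window position.
    bias:     scalar bias of this feature map.
    """
    k = len(weights)
    feature_map = []
    for i in range(len(sentence) - k + 1):
        total = bias
        for offset in range(k):
            word = sentence[i + offset]
            total += sum(w * x for w, x in zip(weights[offset], word))
        feature_map.append(math.tanh(total))  # tanh is an assumed activation
    return feature_map


def max_pool(feature_map, size=2):
    """Max pooling over non-overlapping windows of `size` units."""
    return [
        max(feature_map[i:i + size])
        for i in range(0, len(feature_map), size)
    ]


# Four words with 2-dimensional toy embeddings.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
weights = [[0.5, -0.5], [0.25, 0.25]]  # window width k = 2
fmap = conv_feature_map(sentence, weights, bias=0.0)
pooled = max_pool(fmap)
print(len(fmap), len(pooled))  # 3 2
```

In the full model, one such feature map is computed per filter, the question and the answer pass through separate stacks of these layers, and the resulting sentence vectors are concatenated before the fully connected layer.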

Context dependent representation with T-CRNN
In order to find the correlation within the sequence of answers, an RNN is utilized. There are two kinds of correlation among answers: a preceding answer may contain context information for a subsequent answer, and a subsequent answer may comment on a preceding answer. A bi-directional RNN is utilized for modelling the semantic correlation between answers and for gathering the context-based representations of the set of answers.
The Long Short-Term Memory (LSTM) unit is used as the basic building block for modelling the semantic correlation among answers. It can modulate its memory instead of overwriting the states each time. The memory cell is the key component of the LSTM unit, and its state varies over time. The LSTM unit decides on the modification and addition of memory in the cell through sigmoidal units: the input unit $i_t$, the output unit $o_t$ and the forget unit $f_t$.
For the QA pair representation $p_t$ at the current time $t$, the memory cell $c_t$ is modified through the activation of the forget unit $f_t$ and the input unit $i_t$. The update equation is:

$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c p_t + U_c h_{t-1})$$

The context of the LSTM unit is updated by discarding the unwanted context through the forget unit $f_t$ and appending the new part through the input unit $i_t$. The extents to which these two contexts are modulated are computed as follows:

$$i_t = \sigma(W_i p_t + U_i h_{t-1} + V_i c_{t-1})$$
$$f_t = \sigma(W_f p_t + U_f h_{t-1} + V_f c_{t-1})$$

For the updated cell state $c_t$, the output unit $o_t$ and the output representation $h_t$ of the LSTM unit are computed as

$$o_t = \sigma(W_o p_t + U_o h_{t-1} + V_o c_t)$$
$$h_t = o_t \odot \tanh(c_t)$$

where $W_* \in \mathbb{R}^{H \times N}$, $U_* \in \mathbb{R}^{H \times H}$, $(W_*, U_*)$ represents the LSTM unit parameters, $\sigma(\cdot)$ represents the sigmoid activation function, $H$ is the dimension of the hidden layer, $N$ denotes the dimension of the QA pair representation, and $V_i$, $V_f$, $V_o$ are diagonal matrices. The bi-directional RNNs learn the QA pairs in parallel from the sequence of QA pair representations $P = [p_1, \ldots, p_t, \ldots, p_{|P|}]$ in the forward and reverse directions, so that both the future and the previous context in the answer sequence are used. The bi-directional RNNs $(\overrightarrow{\mathrm{RNN}}, \overleftarrow{\mathrm{RNN}})$ are applied to the fixed representation $p_t$ of a specific QA pair $(q, a_t)$ to obtain the context-dependent representation:

$$h_t = W_{\overrightarrow{h}}\, \overrightarrow{\mathrm{RNN}}(p_t) + W_{\overleftarrow{h}}\, \overleftarrow{\mathrm{RNN}}(p_t)$$

where $W_{\overleftarrow{h}}$ and $W_{\overrightarrow{h}}$ represent the concatenation parameters of the two directional RNNs. The representations learnt from the bi-directional sequence are combined, and the context-dependent representation $h_t$ is computed for the QA pair $(q, a_t)$.
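A single LSTM update and a bi-directional pass over a sequence of QA pair representations can be sketched in plain Python. This is an illustration under several assumptions: the parameter names (`W_`, `U_`, `V_` prefixes), the constant toy weights from the hypothetical `toy_params` helper, the omission of bias terms, and the plain concatenation of forward and backward states (without learned combination parameters) are simplifications, not the configuration used in this work.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lstm_step(p_t, h_prev, c_prev, params):
    """One LSTM update; diagonal peephole matrices are stored as vectors."""
    def gate(name, cell, squash):
        wx = matvec(params["W_" + name], p_t)
        uh = matvec(params["U_" + name], h_prev)
        peep = params.get("V_" + name, [0.0] * len(cell))
        return [squash(a + b + v * c) for a, b, v, c in zip(wx, uh, peep, cell)]

    i_t = gate("i", c_prev, sigmoid)    # input unit
    f_t = gate("f", c_prev, sigmoid)    # forget unit
    g_t = gate("c", c_prev, math.tanh)  # candidate memory (no peephole term)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    o_t = gate("o", c_t, sigmoid)       # output unit sees the updated cell
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def run(sequence, params, hidden):
    """Run the LSTM over a sequence and collect every hidden state."""
    h, c = [0.0] * hidden, [0.0] * hidden
    outputs = []
    for p_t in sequence:
        h, c = lstm_step(p_t, h, c, params)
        outputs.append(h)
    return outputs


def bidirectional(sequence, fwd_params, bwd_params, hidden):
    """Concatenate forward and backward states for each position."""
    forward = run(sequence, fwd_params, hidden)
    backward = run(sequence[::-1], bwd_params, hidden)[::-1]
    return [f + b for f, b in zip(forward, backward)]


def toy_params(hidden, inputs, value=0.1):
    """Constant-valued parameters, for illustration only."""
    params = {}
    for name in "ifco":
        params["W_" + name] = [[value] * inputs for _ in range(hidden)]
        params["U_" + name] = [[value] * hidden for _ in range(hidden)]
    for name in "ifo":
        params["V_" + name] = [value] * hidden
    return params


pairs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy QA pair vectors
states = bidirectional(pairs, toy_params(2, 2), toy_params(2, 2), hidden=2)
print(len(states), len(states[0]))  # 3 4
```

Each position in `states` then carries context from both earlier and later answers in the sequence.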
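In order to compute the score $y_t = [y_1, \ldots, y_i, \ldots, y_{|C|}]$ over the answer classes $C$, the context-dependent representation $h_t$ is fed into the scoring process. It is computed as

$$y_t = W_s h_t + b_s$$

where $W_s \in \mathbb{R}^{|C| \times H}$ with $H$ the dimension of $h_t$, $(W_s, b_s)$ denotes the scoring parameters, $|C|$ represents the number of answer classes, and $h_t$ denotes the final representation of the QA pair representation $p_t$. The prediction for the current answer is computed with the softmax function applied to $y_t$.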
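The scoring and softmax prediction can be illustrated with a short sketch. The single linear scoring layer, the toy weights and the two answer classes are assumptions made for illustration; `score` and `softmax` are hypothetical helper names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def score(h_t, weights, bias):
    """Linear scoring layer: one score per answer class (assumed form)."""
    return [
        sum(w * h for w, h in zip(row, h_t)) + b
        for row, b in zip(weights, bias)
    ]


h_t = [0.2, -0.1, 0.4]                         # toy final representation
weights = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]   # 2 classes x 3 features
bias = [0.0, 0.0]
probs = softmax(score(h_t, weights, bias))
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted)  # 0
```

Subtracting the maximum score before exponentiation leaves the probabilities unchanged and avoids overflow for large scores.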

RESULTS AND DISCUSSION
The dataset used for our work is the Factoid Q&A Corpus [25], which contains 1,714 manually created factoid questions and their relevant answers, collected by the University of Pittsburgh and Carnegie Mellon University between 2008 and 2010. The configuration of the proposed T-CRNN is as follows. For modelling both the question and the answer, the CNN is constructed with 3 convolutional and pooling layers. The number of feature maps in each layer is 100. The convolution window sizes of the three layers are 5x100, 4x1 and 2x1, and the pooling window sizes are 4x100, 3x1 and 3x1, respectively. This configuration is also denoted as (5,4,2) and (4,3,3). The RNN is modelled with 2 LSTM units, and the size of all LSTM units is set to 360. Pre-trained word embeddings with a dimension of 100 are used. The maximum length of each sentence is set to 100. The proposed T-CRNN is compared with existing approaches, namely JAIST, ICRC and RCNN. The performance of the proposed work is validated using precision, recall, F-measure and accuracy; higher values of these measures indicate better performance.
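For reference, the reported hyperparameters can be collected in a small configuration object. The class name `TCRNNConfig` and its field names are hypothetical; only the values are taken from the description above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TCRNNConfig:
    """Hyperparameters as reported in the text; field names are illustrative."""
    conv_windows: Tuple[int, ...] = (5, 4, 2)
    pool_windows: Tuple[int, ...] = (4, 3, 3)
    feature_maps: int = 100
    lstm_units: int = 2
    lstm_size: int = 360
    embedding_dim: int = 100
    max_sentence_length: int = 100

    def __post_init__(self):
        # Each convolutional layer is paired with one pooling layer.
        if len(self.conv_windows) != len(self.pool_windows):
            raise ValueError("conv_windows and pool_windows must align")


config = TCRNNConfig()
print(len(config.conv_windows))  # 3
```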

Precision
Precision is the fraction of positive (relevant) instances among all retrieved instances. It is also known as the positive predictive value. The precision of the different approaches for varying numbers of documents is compared in Figure 5. For the existing approaches, the precision lies between 0.4 and 0.6 when the number of documents is varied over 300, 500, 700 and 1000. The proposed approach reaches a precision of up to 0.98. As the number of documents increases, the precision degrades. The existing approaches perform similarly to one another, with precision below 0.6.

F-measure
The F-measure is computed as the harmonic mean of precision and recall. It is also termed the balanced F-score. It is close to the arithmetic mean of precision and recall when the two values are close to each other.
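As a worked illustration of these measures, the sketch below computes precision, recall and the balanced F-measure from counts of true positives, false positives and false negatives. The counts are made-up toy values, not results from the experiments, and `precision_recall_f1` is a hypothetical helper name.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and balanced F-measure from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Toy counts: 8 correct answers retrieved, 2 wrong ones, 8 missed.
p, r, f = precision_recall_f1(tp=8, fp=2, fn=8)
print(p, r, round(f, 3))  # 0.8 0.5 0.615
```

Here the harmonic mean (0.615) is lower than the arithmetic mean (0.65) because precision and recall differ; the two means coincide only when precision equals recall.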

Accuracy
The accuracy of the proposed work is estimated based on the number of correct answer selections. The accuracy is calculated for 300, 500, 700 and 1000 documents. The accuracy with 300 documents is higher than with 1000 documents, which shows that increasing the number of documents reduces the accuracy of the CQA system. The accuracy comparison for varying numbers of documents is shown in Figure 11. The accuracy of the proposed T-CRNN is higher than that of the conventional approaches. Figure 12 shows that the average accuracy of the proposed approach is also higher than that of the conventional approaches. The precision, recall, F-measure and accuracy comparisons show the improved performance of the proposed CQA system.

CONCLUSION
In this work, T-CRNN is proposed for answer selection in a complex question answering system. First, the input question is decomposed into several questions. This is accomplished by replacing each entity of the question with a template through the use of the knowledge base. Then the sentence matrices of the question and answer pair are created and represented in vector form using pre-trained word embeddings. The semantic matching pattern between question and answer is obtained with the CNN. The semantic correlation among the sequence of answers is captured with the RNN. The score for each answer is computed with a multi-layer perceptron and a softmax classifier. The proposed approach is evaluated with performance measures such as precision, recall, F-measure and accuracy. Its performance is compared with that of conventional methods, and the comparison shows the efficiency of the proposed CQA system.