Geographical queries reformulation using a parallel association rules generator to build spatial taxonomies

Omar El Midaoui, Btihal El Ghali, Abderrahim El Qadi

Abstract


Geographical queries need a dedicated reformulation process in Information Retrieval Systems (IRS) because of their specificities and hierarchical structure. Most web search engines ignore this. In this paper, we propose an automatic approach for building a spatial taxonomy that models the notion of adjacency and that is used to reformulate the spatial part of a geographical query. The approach exploits the documents at the top of the list retrieved when submitting a spatial entity, which is composed of a spatial relation and the name of a city. A transactional database is then constructed, in which each extracted document is a transaction containing the names of the cities that share the country of the submitted query's city. The Frequent Pattern Growth (FP-Growth) algorithm is applied to this database in its parallel version (Parallel FP-Growth: PFP) to generate association rules, which form the country's taxonomy in a Big Data context. Experiments were conducted on Spark, and the results show that query reformulation using the taxonomy built with the proposed approach improves the precision and the effectiveness of the IRS.
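The pipeline summarized above (documents to transactions to association rules) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it replaces Parallel FP-Growth on Spark with a brute-force count of city pairs on a single machine, and the city list, document texts, thresholds, and function names (`build_transactions`, `pairwise_rules`) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical inputs: city names of one country and the text of top-ranked documents.
CITIES = {"Rabat", "Sale", "Kenitra", "Casablanca"}
DOCUMENTS = [
    "Hotels near Rabat: Sale and Kenitra are a short drive away",
    "Rabat and Sale are separated by the Bou Regreg river",
    "Trains run from Casablanca to Rabat",
    "Kenitra lies north of Rabat and Sale",
]


def build_transactions(documents, cities):
    """Build one transaction per document: the set of known city names it mentions."""
    transactions = []
    for text in documents:
        # Naive tokenization (assumption): split on whitespace, strip punctuation.
        tokens = {tok.strip(".,:;") for tok in text.split()}
        found = tokens & cities
        if found:
            transactions.append(found)
    return transactions


def pairwise_rules(transactions, min_support, min_confidence):
    """Return sorted rules (antecedent, consequent, support, confidence) for city pairs.

    Brute-force stand-in for FP-Growth, restricted to rules with a single
    antecedent and a single consequent.
    """
    n = len(transactions)
    item_counts = Counter(city for t in transactions for city in t)
    pair_counts = Counter(
        frozenset(pair) for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for pair, count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        a, b = sorted(pair)
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    txns = build_transactions(DOCUMENTS, CITIES)
    for rule in pairwise_rules(txns, min_support=0.5, min_confidence=0.8):
        print(rule)
```

In this toy setting, a rule such as Sale → Rabat suggests that the two cities tend to be mentioned together, which is the kind of signal the taxonomy uses to model adjacency. At scale, the same transactions could presumably be passed to a distributed miner such as the `FPGrowth` estimator in Spark's `pyspark.ml.fpm` module instead of the pair count used here.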

Keywords


big data; geographical query; information retrieval; machine learning; parallel FP-growth algorithm; reformulation; spark; spatial entity



DOI: http://doi.org/10.11591/ijece.v11i3.pp%25p


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

ISSN 2088-8708, e-ISSN 2722-2578