Investigating the PageRank and sequence prediction based approaches for next page prediction

Nguyen Thon Da, Tan Hanh

Abstract


Discovering unseen patterns from web clickstream is an upcoming research area. One of the meaningful approaches for making predictions is using sequence prediction that is typically the improved compact prediction tree (CPT+). However, to increase this method's effectiveness, combining it with at least other methods is necessary. This work investigates such PageRank-based methods related to sequence prediction as All-K-Markov, DG, Markov 1st, CPT, CPT+. The experimental results proved that the integration of CPT+ and PageRank is the right solution for next page prediction in terms of accuracy, which is more than a standard method of approximately 0.0621%. Still, the size of the newly created sequence database is reduced up to 35%. Furthermore, our proposed solution has an accuracy that is much higher than other ones. It is intriguing for the next phase (testing one) to make the next page prediction in terms of time performance.

Keywords


CPT+; markov; PageRank; sequence prediction;

Full Text:

PDF


DOI: http://doi.org/10.11591/ijece.v11i3.pp2229-2237

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578