Investigating the PageRank and sequence prediction based approaches for next page prediction

Nguyen Thon Da, Tan Hanh

Abstract


Discovering unseen patterns in web clickstream data is an emerging research area. One effective approach to making predictions is sequence prediction, typified by CPT+ (the improved Compact Prediction Tree). However, to increase the effectiveness of this method, it needs to be combined with at least one other method. This work investigates PageRank-based methods combined with sequence prediction models, namely All-K-Markov, DG, Markov 1st, CPT, and CPT+. The experimental results show that integrating CPT+ with PageRank is a suitable solution for next page prediction in terms of accuracy, which exceeds that of a standard method by approximately 0.0621%, while the size of the newly created sequence database is reduced by up to 35%. Furthermore, the proposed solution achieves much higher accuracy than the other methods considered. The next phase (the testing phase) is to evaluate next page prediction in terms of time performance.
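
To make the idea of pairing PageRank with a sequence predictor concrete, the following is a minimal sketch and not the authors' implementation. It assumes that PageRank is computed over the page-to-page transition graph of the clickstream, that low-ranked pages are dropped to shrink the sequence database, and that a first-order Markov model (used here in place of CPT+ for brevity) predicts the next page. The function names, the damping factor of 0.85, the `keep_fraction` value, and the sample sessions are all hypothetical.

```python
from collections import Counter, defaultdict


def pagerank(sequences, damping=0.85, iterations=50):
    """Power-iteration PageRank over the page transition graph (assumed setup)."""
    links = defaultdict(set)
    pages = set()
    for seq in sequences:
        pages.update(seq)
        for src, dst in zip(seq, seq[1:]):
            if src != dst:
                links[src].add(dst)
    if not pages:
        return {}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for src in pages:
            for dst in links[src]:
                new_rank[dst] += damping * rank[src] / len(links[src])
        rank = new_rank
    return rank


def prune(sequences, rank, keep_fraction=0.8):
    """Hypothetical reduction step: keep only the highest-ranked pages."""
    ordered = sorted(rank, key=rank.get, reverse=True)
    keep = set(ordered[:max(1, int(len(ordered) * keep_fraction))])
    pruned = [[p for p in seq if p in keep] for seq in sequences]
    # Sequences with fewer than two pages carry no transition information.
    return [seq for seq in pruned if len(seq) >= 2]


def train_markov(sequences):
    """Count first-order transitions: counts[current][next] = frequency."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1
    return counts


def predict_next(counts, page):
    """Return the most frequent successor of `page`, or None if unseen."""
    if page not in counts:
        return None
    return counts[page].most_common(1)[0][0]


# Made-up sessions, for illustration only.
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "news", "sports"],
    ["about", "home", "news"],
]
rank = pagerank(sessions)
reduced = prune(sessions, rank)
model = train_markov(reduced)
print(predict_next(model, "home"))
```

Pruning before training is one plausible way in which a PageRank step could reduce the size of the sequence database; the reduction criterion actually used in the paper is not stated in the abstract.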

Keywords


CPT+; Markov; PageRank; sequence prediction



DOI: http://doi.org/10.11591/ijece.v11i3.pp%25p


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

ISSN 2088-8708, e-ISSN 2722-2578