Journal of Networks, Vol 5, No 9 (2010), 1017-1024, Sep 2010
doi:10.4304/jnw.5.9.1017-1024

An Efficient Method of Web Sequential Pattern Mining Based on Session Filter and Transaction Identification

Jingjun Zhu, Haiyan Wu, Guozhu Gao

Abstract


Web sequential pattern mining is an important way to analyze the access behavior of web users. In this paper, we present an efficient method of web sequential pattern mining based on session filter and transaction identification. Unlike traditional mining methods, we categorize user sessions into human user sessions, crawler sessions, and resource-download user sessions. We then filter out the non-human user sessions, leaving only the human user sessions for sequential pattern mining. To mine users’ meaningful sequential patterns, we identify users’ transactions within the user sessions and perform sequential pattern mining at the transaction level. We present a transaction identification method based on the users’ access path tree, which finds all transactions effectively. We also improve the PrefixSpan algorithm to reduce its memory usage and avoid generating duplicate projections. The experimental results of our mining method are very satisfactory.
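For readers unfamiliar with PrefixSpan, the following is a minimal sketch of the baseline algorithm, not the improved variant proposed in the paper. It assumes each transaction is an ordered list of single page identifiers, and it uses pseudo-projection (storing sequence index and offset pairs instead of copying suffixes), a standard memory-saving technique that may differ from the authors' own improvements. The function name `prefixspan`, the `min_support` parameter, and the sample page names are illustrative assumptions.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for all frequent sequential patterns.

    Illustrative baseline only: each sequence is a list of single items.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs rather than
        # copied suffixes (pseudo-projection), with one pair per sequence.
        first_pos = defaultdict(dict)
        for seq_id, start in projected:
            seq = sequences[seq_id]
            for pos in range(start, len(seq)):
                # Record only the first occurrence of each item per sequence,
                # so support counts each sequence at most once.
                first_pos[seq[pos]].setdefault(seq_id, pos)
        for item, occurrences in sorted(first_pos.items()):
            if len(occurrences) < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = len(occurrences)
            # Grow the pattern using the suffix after each first occurrence.
            mine(pattern, [(sid, pos + 1) for sid, pos in occurrences.items()])

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical transactions: ordered page visits of three human users.
transactions = [
    ["home", "list", "item"],
    ["home", "item"],
    ["list", "item", "home"],
]
patterns = prefixspan(transactions, min_support=2)
print(patterns[("home", "item")])  # prints 2
```

With a minimum support of 2, this toy input yields the single-page patterns plus `("home", "item")` and `("list", "item")`, each supported by two transactions.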


Keywords


Web Mining; Sequential Pattern; Session Filter; Transaction Identification; PrefixSpan Algorithm



Journal of Networks (JNW, ISSN 1796-2056)

Copyright © 2006-2012 by ACADEMY PUBLISHER – All rights reserved.