Journal of Networks, Vol 6, No 5 (2011), 728-735, May 2011
doi:10.4304/jnw.6.5.728-735

Incremental Mining of Closed Sequential Patterns in Multiple Data Streams

Shih-Yang Yang, Ching-Ming Chao, Po-Zung Chen, Chu-Hao Sun

Abstract


Sequential pattern mining discovers the relative order in which events occur, allowing users to make predictions from the discovered patterns. With the rapid advance of information technology in recent years, data change quickly, data volumes have grown explosively, and the demand for real-time processing keeps increasing, which has given rise to the data stream environment. Data in this environment cannot be stored in full, and the inadequacy of traditional mining techniques there has led to the emergence of data stream mining. Multiple data streams are one branch of the data stream environment. The MILE algorithm mines in a one-time fashion, so it cannot preserve previously mined sequential patterns when new data arrive. To address this problem, we propose the ICspan algorithm, which continues mining sequential patterns incrementally and thereby obtains a more accurate mining result. In addition, because the algorithm is restricted to mining closed sequential patterns, fewer sequential patterns are generated and recorded, which reduces memory usage and effectively improves execution efficiency.
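
The memory saving attributed to closed patterns can be illustrated with a minimal sketch. This is not the ICspan algorithm, whose details are not given in this abstract. It is a brute-force illustration of the standard definition: a frequent sequential pattern is closed if no longer pattern containing it has the same support. The function names, the toy database, and the simplification that each event is a single item are all assumptions made for illustration.

```python
from itertools import combinations

def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)

def support(pattern, database):
    """Number of sequences in the database that contain the pattern."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))

def frequent_patterns(database, min_support):
    """Brute force: enumerate every subsequence and keep the frequent ones."""
    candidates = set()
    for seq in database:
        for length in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), length):
                candidates.add(tuple(seq[i] for i in idx))
    counts = {p: support(p, database) for p in candidates}
    return {p: s for p, s in counts.items() if s >= min_support}

def closed_patterns(frequent):
    """Keep a pattern only if no longer super-pattern has the same support."""
    closed = {}
    for p, s in frequent.items():
        absorbed = any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in frequent.items()
        )
        if not absorbed:
            closed[p] = s
    return closed

# Hypothetical toy database of three event sequences.
db = [("a", "b", "c"), ("a", "b", "c"), ("a", "c")]
freq = frequent_patterns(db, min_support=2)
closed = closed_patterns(freq)
print(len(freq), "frequent patterns,", len(closed), "closed patterns")
```

In this toy case, seven frequent patterns reduce to two closed ones, and the support of every dropped pattern can still be recovered from the closed pattern that contains it. This is why recording only closed patterns can lower memory usage without losing information.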


Keywords


Multiple Data Streams, Data Stream Mining, Sequential Pattern Mining, Incremental Mining




Journal of Networks (JNW, ISSN 1796-2056)
