A Novel Weighted Phrase-Based Similarity for Web Documents Clustering
Abstract
Keywords
References
N. Oikonomakou, and M. Vazirgiannis, "A Review of Web Document Clustering Approaches," Data Mining and Knowledge Discovery Handbook, pp. 921-943: Springer US, 2005.
L. Yanjun, “Text Clustering with Feature Selection by Using Statistical Data,” IEEE Transactions on Knowledge and Data Engineering, vol. 20, pp. 641-652, 2007.
http://dx.doi.org/10.1109/TKDE.2007.190740
Y. Li, S. M. Chung, and J. D. Holt, “Text Document Clustering Based on Frequent Word Meaning Sequences,” Data & Knowledge Engineering, vol. 64, no. 1, pp. 381-404, 2008.
http://dx.doi.org/10.1016/j.datak.2007.08.001
K. M. Hammouda, and M. S. Kamel, “Efficient Phrase-Based Document Indexing for Web Document Clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 10, pp. 1279-1296, 2004.
http://dx.doi.org/10.1109/TKDE.2004.58
H. Chim, and X. Deng, “Efficient Phrase-Based Document Similarity for Clustering,” IEEE Transactions on Knowledge and Data Engineering, vol. 20, no. 9, pp. 1217-1229, 2008.
http://dx.doi.org/10.1109/TKDE.2008.50
O. Zamir, and O. Etzioni, "Web Document Clustering: A Feasibility Demonstration," Proceedings of the 21st International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 46-54, 1998.
S. Zu Eissen, B. Stein, and M. Potthast, “The Suffix Tree Document Model Revisited,” in Proceedings of the 5th International Conference on Knowledge Management (I-KNOW 05), Graz, Austria, 2005, pp. 596-603.
C. Manning, P. Raghavan, and H. Schütze, "An introduction to information retrieval," pp. 377-400, Cambridge, England: Cambridge University Press, 2009.
E. Ukkonen, “On-Line Construction of Suffix Trees,” Algorithmica, vol. 14, no. 3, pp. 249-260, 1995.
http://dx.doi.org/10.1007/BF01206331
C. Carpineto, and G. Romano. "Ambient Dataset," 2008; http://credo.fub.it/ambient/.
S. Osiński, and D. Weiss, “Carrot 2: Design of a Flexible and Efficient Web Information Retrieval Framework,” Advances in Web Intelligence, vol. 3528, pp. 439-444, 2005.
http://dx.doi.org/10.1007/11495772_68


