Journal of Computers, Vol 6, No 3 (2011), 466-473, Mar 2011
doi:10.4304/jcp.6.3.466-473

A New Intelligent Topic Extraction Model on Web

Ming Xie, Chanle Wu, Yunlu Zhang

Abstract


We address the problem of topic extraction on the Web. We propose an approach to ontology-based data access using WordNet whose distinguishing feature is an optimized version of the density-based clustering algorithm OPTICS (DBCO) for extracting topics. Our solution has two desirable properties: (i) it uses WordNet for word sense disambiguation of the words in learning-resource documents, and (ii) it maps the data space of the original method to a vector space of sentences, improving on the original OPTICS algorithm. We outline the interface between our scheme and the current data Web and show that, in contrast to existing approaches, DBCO produces no exponential blowup. Based on experiments with real-world data sets from 310 users at three study sites, we demonstrate that topic extraction with the proposed approach is efficient, especially for large-scale Web learning resources. According to user rating data collected from four learning sites over 150 days, user ratings increased by an average of 25.18% after the system was introduced.
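To make the idea of clustering in a vector space of sentences concrete, the following is a minimal sketch and not the authors' DBCO. It assumes a plain bag-of-words sentence vector and cosine distance (both illustrative choices not stated in the abstract), omits the WordNet word sense disambiguation step entirely, and runs the standard OPTICS ordering procedure. The function names, the toy sentences, and the values of `eps` and `min_pts` are all hypothetical.

```python
import math
from collections import Counter


def sentence_vector(sentence):
    """Map a sentence to a sparse term-frequency vector (assumed bag of words)."""
    return Counter(sentence.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def optics_order(vectors, eps, min_pts):
    """Standard OPTICS: return the cluster ordering and reachability distances."""
    n = len(vectors)
    dist = [[cosine_distance(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]
    reach = [math.inf] * n
    processed = [False] * n
    order = []

    def neighbors(p):
        # The eps-neighborhood includes p itself (distance 0).
        return [q for q in range(n) if dist[p][q] <= eps]

    def core_distance(p, nbrs):
        # Distance to the min_pts-th nearest neighbor, or None if p is not a core point.
        if len(nbrs) < min_pts:
            return None
        return sorted(dist[p][q] for q in nbrs)[min_pts - 1]

    def update(p, nbrs, seeds):
        # Lower the reachability of unprocessed neighbors of core point p.
        cd = core_distance(p, nbrs)
        for q in nbrs:
            if not processed[q]:
                new_reach = max(cd, dist[p][q])
                if new_reach < reach[q]:
                    reach[q] = new_reach
                    seeds.add(q)

    for start in range(n):
        if processed[start]:
            continue
        processed[start] = True
        order.append(start)
        nbrs = neighbors(start)
        if core_distance(start, nbrs) is None:
            continue
        seeds = set()
        update(start, nbrs, seeds)
        while seeds:
            # Always expand the seed with the smallest reachability distance.
            q = min(seeds, key=lambda s: (reach[s], s))
            seeds.remove(q)
            processed[q] = True
            order.append(q)
            q_nbrs = neighbors(q)
            if core_distance(q, q_nbrs) is not None:
                update(q, q_nbrs, seeds)
    return order, reach


# Toy input: two interleaved groups of sentences with disjoint vocabularies.
sentences = [
    "clustering groups similar sentences",
    "wordnet resolves word senses",
    "clustering groups similar documents",
    "wordnet resolves ambiguous word senses",
    "clustering groups related sentences",
    "wordnet lists word senses",
]
vectors = [sentence_vector(s) for s in sentences]
order, reach = optics_order(vectors, eps=0.6, min_pts=2)
```

In the resulting ordering, sentences from the same group appear next to each other, and a sentence whose reachability distance stays infinite marks the start of a new dense region. A real system would replace the raw tokens with disambiguated word senses before building the vectors.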


Keywords


Topic Extraction, E-learning, Semantic, Ontology



Journal of Computers (JCP, ISSN 1796-203X)

Copyright © 2006-2012 by ACADEMY PUBLISHER – All rights reserved.