Journal of Software, Vol 7, No 6 (2012), 1315-1320, Jun 2012
doi:10.4304/jsw.7.6.1315-1320

Semantically Enhanced Uyghur Information Retrieval Model

Bo Ma, Yating Yang, Xi Zhou, Junlin Zhou

Abstract


Traditional Uyghur search engines lack semantic information. To address this problem, a semantically enhanced Uyghur information retrieval model is proposed based on the characteristics of the Uyghur language. First, word stemming is carried out and web pages are represented as triples (3-tuples) to construct the Uyghur knowledge base; the matching between ontologies and web pages is then established by computing concept similarity and relation similarity. A semantic inverted index is built to store the associations between semantic entities and web pages, and user query analysis is implemented by expanding the queries and analyzing the relations between query terms. Finally, a ranking algorithm is implemented that combines the benefits of keyword-based and semantic-based methods. Comparative experiments against the Google search engine and a Lucene-based method provide preliminary validation of the model's effectiveness and feasibility.


Keywords


Uyghur; ontology; semantic search; semantic relation; information retrieval
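
The abstract describes a semantic inverted index that links semantic entities to web pages, and a ranking step that combines keyword-based and semantic-based evidence. The following is only a minimal sketch of that idea, not the authors' implementation: the function names, the max-normalization, the linear mixing weight `alpha`, and the sample data are all assumptions made for illustration, since the abstract does not specify the actual ranking formula.

```python
from collections import defaultdict


def build_semantic_index(annotations):
    """Build an entity -> {doc_id: weight} map.

    `annotations` is an iterable of (doc_id, entity, weight) tuples.
    The tuple layout is a hypothetical choice for this sketch.
    """
    index = defaultdict(dict)
    for doc_id, entity, weight in annotations:
        index[entity][doc_id] = index[entity].get(doc_id, 0.0) + weight
    return index


def semantic_scores(index, query_entities):
    """Sum the stored weights of every query entity found in each page."""
    scores = defaultdict(float)
    for entity in query_entities:
        for doc_id, weight in index.get(entity, {}).items():
            scores[doc_id] += weight
    return dict(scores)


def normalize(scores):
    """Scale scores into [0, 1] by dividing by the maximum (assumed scheme)."""
    top = max(scores.values(), default=0.0)
    if top <= 0:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: s / top for doc_id, s in scores.items()}


def combined_ranking(keyword_scores, sem_scores, alpha=0.5):
    """Rank pages by alpha * semantic + (1 - alpha) * keyword score.

    The linear combination and the default alpha are assumptions.
    """
    kw = normalize(keyword_scores)
    sem = normalize(sem_scores)
    docs = set(kw) | set(sem)
    combined = {
        d: alpha * sem.get(d, 0.0) + (1 - alpha) * kw.get(d, 0.0) for d in docs
    }
    # Highest score first; ties broken by doc_id for a stable order.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample data: page ids, entity names, and weights are made up.
annotations = [("p1", "Urumqi", 1.0), ("p2", "Urumqi", 0.5), ("p2", "Kashgar", 1.0)]
index = build_semantic_index(annotations)
sem = semantic_scores(index, ["Urumqi"])
keyword = {"p2": 2.0, "p3": 1.0}  # e.g. scores from a keyword engine
ranking = combined_ranking(keyword, sem, alpha=0.5)
```

In this sketch a page that matches only on keywords (`p3`) or only on semantic entities (`p1`) still appears in the ranking, while a page supported by both kinds of evidence (`p2`) is ranked first.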




Journal of Software (JSW, ISSN 1796-217X)
