Journal of Software, Vol 6, No 7 (2011), 1265-1272, Jul 2011
doi:10.4304/jsw.6.7.1265-1272

An Efficient Discriminant Analysis Algorithm for Document Classification

Ziqiang Wang, Xia Sun

Abstract


Document categorization has become one of the most important research areas of pattern recognition and data mining due to the exponential growth of documents in the Internet and the emergent need to organize them. The document space is always of very high dimensionality and learning in such a high dimensional space is often impossible due to the curse of dimensionality. To cope with performance and accuracy problems with high dimensionality, a novel dimensionality reduction algorithm called IKDA is proposed in this paper. The proposed IKDA algorithm combines kernel-based learning techniques and direct iterative optimization procedure to deal with the nonlinearity of the document distribution. The proposed algorithm also effectively solves the so-called “small sample size” problem in document classification task. Extensive experimental results on two real world data sets demonstrate the effectiveness and efficiency of the proposed algorithm.


Keywords


document classification;kernel discriminant analysis;dimensionality reduction;data mining

References


[1] F.Sebastiani, “Machine learning in automated text categorization,” ACM Computing Surveys,vol.34,pp.1-47, March 2002.
http://dx.doi.org/10.1145/505282.505283

[2] C.J.van Rijsbergen, Information Retrieval, 2nd edition. London:Butterworths,1979.

[3] I.T.Jolliffe, Principal Component Analysis. New York: Springer-Verlag,1986.

[4] K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd edition.New York:Academic,1990.

[5] R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification, Second Edition. Hoboken:Wiley-Interscience,2000.

[6] X.Wang and X.Tang, “A unified framework for subspace face recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.26,pp.1222-1228,Septmber 2004.
PMid:15742896

[7] K.Torkkola,“Discriminative features for text document classification,” Pattern Analysis & Applications, vol.6, pp. 301-308,February 2004.
http://dx.doi.org/10.1007/s10044-003-0196-8

[8] Q.Liu,H.Lu,and S.Ma, “Improving kernel fisher discriminant analysis for face recognition,” IEEE Transactions on Circuits and Systems for Video Technology, vol.14,pp.42-49,January 2004.
http://dx.doi.org/10.1109/TCSVT.2003.818352

[9] J.Ye, T.Li, T.Xiong, and R.Janardan,“Using uncorrelated discriminant analysis for tissue classification with gene expression data,” IEEE/ACM Transactions on Computational Biology and Bioinformatics,vol.1,pp.181-190,October-December 2004.
http://dx.doi.org/10.1109/TCBB.2004.45
PMid:17051700

[10] P.N.Belhumeur, J. Hespanha, and D. J. Kriegman, “Eigenfaces vs. fisherfaces: recognition using class specific linear projection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.19,pp.711-720,July 1997.
http://dx.doi.org/10.1109/34.598228

[11] J. H. Friedman, “Regularized discriminant analysis,” Journal of the American Statistical Association, vol.84, pp.165-175, March 1989.
http://dx.doi.org/10.2307/2289860

[12] T. Hastie and R. Tibshirani, “Penalized discriminant analysis,” Annals of Statistics, vol.23,pp.73-102,January 1995.
http://dx.doi.org/10.1214/aos/1176324456

[13] M.Skurichina and R.P.W. Duin, “Stabilizing classifiers for very small sample size,” Proceedings of the 13th International Conference on Pattern Recognition,pp.891-896,August 1996.
http://dx.doi.org/10.1109/ICPR.1996.547204

[14] H.Yu and J. Yang,“A direct LDA algorithm for high-dimensional data-with application to face recognition,” Pattern Recognition, vol.34,pp.2067–2070,October 2001.
http://dx.doi.org/10.1016/S0031-3203(00)00162-X

[15] L. F. Chen, H. Y. M. Liao, M. T. Ko, J. C. Lin, and G. J. Yu, “A new LDA-based face recognition system which can solve the small sample size problem,” Pattern Recognition,vol.33,pp.1713-1726,October 2000.
http://dx.doi.org/10.1016/S0031-3203(99)00139-9

[16] R. Huang, Q. Liu, H. Lu, and S. Ma, “Solving the small sample size problem of LDA,” Proceedings of the 16th International Conference on Pattern Recognition,pp.29-32,December 2002.

[17] J.Ye, and T.Xiong, “Computational and theoretical analysis of null space and orthogonal linear discriminant analysis,” Journal of Machine Learning Research, vol.7, pp.1183-1204,July 2006.

[18] J. Ye, R. Janardan, Q. Li, and H. Park, “Feature extraction via generalized uncorrelated linear discriminant analysis,” Proceedings of the twenty-first international conference on Machine learning,pp.113-120,July 2004.

[19] J.Ye and Q.Li, “LDA/QR: an efficient and effective dimension reduction algorithm and its theoretical foundation,” Pattern Recognition,vol.37,pp.851-854,April 2004.
http://dx.doi.org/10.1016/j.patcog.2003.08.006

[20] P.Howland and H.Park,“Generalizing discriminant analysis using the generalized singular value decomposition,” IEEE Transactions on Pattern Analysis and Machine Intelligence,vol.26,pp.995-1006,August 2004.
http://dx.doi.org/10.1109/TPAMI.2004.46
PMid:15641730

[21] J. Ye, “Characterization of a family of algorithms for generalized discriminant analysis undersampled problems,” Journal of Machine Learning Research, vol.6, pp.483-502,April 2005.

[22] V. Vapnik, The Nature of Statistical Learning Theory. New York: Springer,1995.

[23] G.Baudat and F.Anouar, “Generalized discriminant analysis using a kernel approach,” Neural Computation, vol.12, pp.2385-2404,October 2000.
http://dx.doi.org/10.1162/089976600300014980
PMid:11032039

[24] K. R. Muller, S. Mika, G. Ratsch, K. Tsuda, and B. Scholkof, “An introduce to kernel based learning algorithm,” IEEE Transactions on Neural Networks, vol.12, pp.181-201.August 2002.
http://dx.doi.org/10.1109/72.914517
PMid:18244377

[25] J.Lu, K.N.Plataniotis, and A.N. Venetsanopoulos, “Face recognition using kernel direct discriminant analysis algorithms,” IEEE Transactions on Neural Networks, vol.14, pp.117-126, January 2003.
http://dx.doi.org/10.1109/TNN.2002.806629
PMid:18237995

[26] J.Yang, A.F. Frangi,Y.Yang,D.Zhang, and Z.Jin, “KPCA plus LDA: a complete kernel fisher discriminant framework for feature extraction and recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence,vol.27, pp.230-244,February 2005.
http://dx.doi.org/10.1109/TPAMI.2005.33
PMid:15688560

[27] C.H.Park and H.Park, “Nonlinear discriminant analysis using kernel functions and the generalized singular value decomposition,” SIAM Journal on Matrix Analysis and Applications,vol.27,pp.87-102,January 2005.
http://dx.doi.org/10.1137/S0895479804442334

[28] H.Wang,S.Yan,D.Xu,X.Tang, and T.Huang, “Trace ratio vs. ratio trace for dimensionality reduction,” Proceedings of 2007 IEEE Conference on Computer Vision and Pattern Recognition,pp.1-8,June 2007.
http://dx.doi.org/10.1109/CVPR.2007.382983

[29] C.-C. Chang and C.-J. Lin, LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm,2001.

[30] J.C.Platt, N.Cristianini, and J. Shawe-Taylor,“Large margin DAGs for multiclass classification,” Advances in Neural Information Processing Systems 12, pp.547-553, November 1999.

[31] Reuters-21578. http://www.daviddlewis.com/resources/testcollections/reuters21578/, 2004.

[32] 20 News Group. http://kdd.ics.uci.edu/databases/20newsgroups/20newsgroups.htm, 2004.

[33] B. Scholkopf, A. Smola, and K.Muller, “Nonlinear component analysis as a kernel eigenvalue problem,” MPI fur biologische kybernetik, Tubingen, Germany, Technology Report 44, 1996.

[34] W. Liu, Y. H.Wang, S. Z. Li, and T.N.Tan, “Null space-based kernel Fisher discriminant analysis for face recognition,” Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, pp.369-374, May 2004.
http://dx.doi.org/10.1109/AFGR.2004.1301558


Full Text: PDF


Journal of Software (JSW, ISSN 1796-217X)

Copyright @ 2006-2012 by ACADEMY PUBLISHER – All rights reserved.