A Novel Method for Speech Data Mining
Abstract
A text-to-speech (TTS) system converts given text into speech and can be used in applications such as information release systems, voice response devices, voice services for e-mail, and reading machines for the blind. Great progress has been made in research on Chinese TTS, and several Chinese TTS systems have been released. However, because of the complexity of the Chinese language, the speech patterns currently available are not fine-grained enough, and the speech quality of systems built on these patterns does not meet users' needs. The main purpose of this paper is to obtain a refined prosodic model of Chinese speech. Instead of traditional methods, data mining techniques are employed. Data mining is the process of discovering useful patterns in a database. Many data mining algorithms now exist, neural networks among them. This paper presents a data mining system that uses a clustering algorithm to find useful patterns in a Chinese speech database. A study of the tone changes in Chinese two-word phrases was carried out, and good results were achieved. These results are helpful for developing high-quality Chinese TTS systems.