Making Czech Historical Radio Archive Accessible and Searchable for Wide Public
Abstract
Keywords
References
[1] J. H. L. Hansen, J. Deller, and M. Seadle, “Engineering challenges in the creation of a National Gallery of the Spoken Word: Transcript-free search of audio archives,” in Proc. IEEE ACM Joint Conf. Digital Libraries, Roanoke, VA, Jun. 2001, pp. 235–236.
[2] J. H. L. Hansen, R. Huang, B. Zhou, M. Seadle, J. R. Deller Jr., A. R. Gurijala, M. Kurimo, and P. Angkititrakul, “SpeechFind: Advances in spoken document retrieval for a national gallery of the spoken word,” IEEE Trans. Speech Audio Processing, vol. 13, no. 5, pp. 712-730, 2005.
http://dx.doi.org/10.1109/TSA.2005.852088
[3] R.J.F. Ordelman, F.M.G. de Jong, and W.F.L. Heeren, “Exploration of audiovisual heritage using audio indexing technology”, Proc. of 1st workshop on intelligent technologies for cultural heritage exploitation, Trento, Italy, Sept. 2006, pp. 36–39.
[4] W. Byrne, D. Doermann, M. Franz, S. Gustman, J. Hajic, D. Oard, M. Picheny, J. Psutka, B. Ramabhadran, D. Soergel, T. Ward, and W.-J. Zhu, “Automatic recognition of spontaneous speech for access to multilingual oral history archives”, IEEE Trans. Speech Audio Process., vol. 12, no. 4, pp. 420-435, 2004.
http://dx.doi.org/10.1109/TSA.2004.828702
[5] J. Nouza, J. Zdansky, P. Cerva, J. Kolorenc, “A system for information retrieval from large records of Czech spoken data”. In: Text, Speech and Dialogue, Lecture Notes in Computer Science, LNCS (LNAI), vol. 4188, Springer, Heidelberg (2006), pp. 401-408.
[6] J. Nouza, J. Zdansky, P. Cerva, “System for automatic collection, annotation and indexing of Czech broadcast speech with full-text search”, Proc. of 15th IEEE MELECON conference, Malta, April 2010, pp. 202-205.
[7] J. Nouza, K. Blavka, M. Bohac, P. Cerva, J. Zdansky, J. Silovsky, and J. Prazak, “Voice Technology to Enable Sophisticated Access to Historical Audio Archive of the Czech Radio”, In Multimedia for Cultural Heritage. Communications in Computer and Information Science. Springer Berlin Heidelberg, 2012, vol. 247, pp. 27-38.
[8] FFmpeg converter available at http://www.ffmpeg.org/
[9] J. Nouza, D. Nejedlova, J. Zdansky, J. Kolorenc, "Very large vocabulary speech recognition system for automatic transcription of Czech broadcast programs", In Proc. of Interspeech-2004, Jeju, Korea, Oct. 2004, pp. 409-412.
[10] J. Nouza, J. Silovsky, J. Zdansky, P. Cerva, M. Kroul, J. Chaloupka, "Czech-to-Slovak adapted broadcast news transcription system", In Proc. of Interspeech-2008, Brisbane, Australia, Sept. 2008, pp. 2683-2686.
[11] P. Cerva, K. Palecek, J. Silovsky, J. Nouza, “Using Unsupervised Feature-Based Speaker Adaptation for Improved Transcription of Spoken Archives”, in Proc. of Interspeech-2011, Florence, Italy, August 2011, pp. 2565-2568.
[12] M. Bohac, K. Blavka, “Automatic segmentation and annotation of audio archive documents”, In Proc. of ECMS-2011, Liberec, Czech Rep., May 2011, pp. 1-6.
[13] V. Hanzl, P. Pollak, “Accuracy Analysis of Generalized Pronunciation Variant Selection in ASR Systems”. In: Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions LNCS, vol. 5641, Springer Heidelberg (2009), pp. 399-408.
[14] S. Chen and P. Gopalakrishnan, “Speaker, environment and channel change detection and clustering via the Bayesian information criterion,” in Proc. of 1998 DARPA Broadcast News Transcription and Understanding Workshop, 1998, pp. 127–132.
[15] J. Silovsky, J. Prazak, P. Cerva, J. Zdansky, J. Nouza, “PLDA-based Clustering for Speaker Diarization of Broadcast Streams”, in Proc. of Interspeech-2011, Florence, Italy, August 2011, pp. 2909-2912.
[16] J. Silovsky, P. Cerva, J. Zdansky, “Comparison of Generative and Discriminative Approaches for Speaker Recognition with Limited Data”, Radioengineering, Vol. 18, pp. 307-316, 2009.
[17] J. Navratil, “Spoken language recognition - a step toward multilinguality in speech processing”, IEEE Transactions on Speech and Audio Processing, vol. 9, pp. 678-685, Sept. 2001.
http://dx.doi.org/10.1109/89.943345
[18] MySQL platform available at http://www.mysql.com/
[19] SPHINX platform available at http://sphinxsearch.com/
[20] Demo of APAP available at http://ahmed.ite.tul.cz/demo/
[21] L. Lamel, J.-L. Gauvain, “Speech processing for audio indexing”. In Advances in Natural Language Processing, LNCS Springer Heidelberg (2008), vol. 5221, pp. 4-15.
http://dx.doi.org/10.1007/978-3-540-85287-2_2
[22] P. Cerva, J. Nouza, J. Silovsky, “Study on Cross-lingual Adaptation of a Czech LVCSR System towards Slovak”. In Analysis of Verbal and Nonverbal Communication and Enactment: The Processing Issues. LNCS, vol. 6800, Springer Heidelberg (2011), pp. 81-87.
[23] Transcriber - a tool for segmenting, transcribing speech: http://trans.sourceforge.net/en/presentation.php
[24] M. Huijbregts, R. Ordelman, F. de Jong, "Speech-based Annotation of Heterogeneous Multimedia Content Using Automatic Speech Recognition". CTIT-technical Report, May 2007.
[25] P. Pollak, M. Behunek, "Accuracy of MP3 speech recognition under real-world conditions: Experimental study", Proc. SIGMAP 2011, Seville, July 2011, pp. 5-10.


