Title: Modelling context in automatic speech recognition
Author: Wiggers, P.
Contributor: Rothkrantz, L.J.M. (promotor); Koppelaar, H. (promotor)
Faculty: Electrical Engineering, Mathematics and Computer Science
Date: 2008-06-04
Abstract: Speech is at the core of human communication. Speaking and listening come so naturally to us that we do not have to think about them at all. The underlying cognitive processes are very rapid and almost completely subconscious. It is hard, if not impossible, not to understand speech. For computers, on the other hand, recognising speech is a daunting task. A recogniser has to deal with a large number of different voices (influenced, among other things, by emotion, moods and fatigue), the acoustic properties of different environments, dialects, a huge vocabulary, and the unlimited creativity of speakers in combining words and breaking the rules of grammar. Almost all existing automatic speech recognisers use statistics over speech sounds (what is the probability that a piece of audio is an a-sound?) and statistics over word combinations to deal with this complexity. The results of those systems are impressive but unfortunately not good enough for most applications of speech recognition. This thesis proposes to put context information in the models of speech recognition to achieve better recognition results. Context is defined as knowledge of the speaker, such as gender and dialect, knowledge of the conversation, and knowledge of the world. The influence of each of those categories is investigated using data analysis and case studies, and new models for speech recognition are defined. In particular, a model is presented that dynamically adapts the vocabulary of the recogniser to the topic of a conversation, which it determines automatically.
Subject: automatic speech recognition; language modelling; dynamic Bayesian networks
To reference this document use: http://resolver.tudelft.nl/uuid:78bf003b-e784-40ca-888c-48e0246b3883
Part of collection: Institutional Repository
Document type: doctoral thesis
Rights: (c) 2008 P. Wiggers
Files: PDF wiggers_20080604.pdf (3.69 MB)