Speech Production Modelling and Analysis

Author: Koutrouvelis, A.I.
Contributor: Heusdens, R. (mentor)
Faculty: Electrical Engineering, Mathematics and Computer Science
Department: Microelectronics
Date: 2014-07-17

Abstract

The first part of the present thesis reviews the speech production mechanism and several models of the glottal flow derivative waveform and of the vocal tract filter. The source-filter model is investigated in depth, since it is the most important "ingredient" of linear prediction analysis. We also review seven linear prediction (LP) methods based on the same general LP optimization framework. Moreover, we examine the importance of pre-emphasis and glottal-cancellation prior to LP.

The second part of the thesis provides an experimental evaluation of the LP methods combined with several pre-emphasis and glottal-cancellation techniques in the context of two general application areas. The first area consists of applications that aim to estimate the true glottal flow or glottal flow derivative signal. The second area consists of applications that aim to find a sparse residual. In particular, five factors are investigated: the sparsity of the residual using the Gini index, the estimation accuracy of the glottal flow derivative using the signal-to-noise ratio (SNR), the estimation accuracy of the vocal tract spectral magnitude using the log spectral distortion (LSD) distance metric, and the probability of obtaining a stable linear prediction filter. All these factors are evaluated for clean and reverberated speech signals. In most cases, the sparse linear prediction methods and the iteratively reweighted least squares method, combined with the second-order pre-emphasis filter, give the most accurate glottal flow derivative estimates, the most accurate vocal tract estimates, and the sparsest residuals.
Finally, we compare several linear prediction methods in the context of the speech dereverberation method proposed in [1, 2]. This method enhances the reverberated residual obtained via the autocorrelation method. In the context of this application, we show that the sparse linear prediction method and the weighted linear prediction method, each combined with a second-order pre-emphasis filter, perform better than the autocorrelation method.

Subject: linear prediction, sparsity, reverberation, glottal flow, vocal tract, pre-emphasis, glottal-cancellation
To reference this document use: http://resolver.tudelft.nl/uuid:d9124f35-4e2c-4b4b-95e3-6e3e0bf9a7b8
Part of collection: Student theses
Document type: master thesis
Rights: (c) 2014 Koutrouvelis, A.I.
Files: mscThesis.pdf (PDF, 4.87 MB)