Evaluation of instrumental measures for the prediction of musical noise in enhanced noisy speech

To establish the true performance of a noise reduction method, one must perform a listening experiment. As listening experiments are often time consuming and expensive, there is a need to replace them with instrumental measures. Consequently, research has provided various instrumental measures that can be used to predict speech quality or speech intelligibility. The aim of the present work is to evaluate a broad range of established instrumental measures in terms of their ability to predict the amount of musical noise present in enhanced noisy speech signals. The performance of the instrumental measures is evaluated using musical noise quantity scores obtained from a specially designed listening experiment performed by normal-hearing listeners. The investigated stimuli, which contain various amounts of musical noise, are produced using the spectral subtraction noise reduction method. Of all considered standard measures, a mean squared distortion measure, an SNR-based method, the PESQ measure, and the STOI measure yield the highest correlations with the listening experiment scores. These results confirm the ability of instrumental measures to predict the amount of musical noise, but further evaluation shows limits to their applicability: the results suggest that the over-subtraction parameter cannot be optimized for a minimum amount of musical noise and maximal speech quality or intelligibility simultaneously. Instead, the results show that maximal speech quality or intelligibility is obtained when a stimulus contains the highest amount of musical noise. To gain more insight into the amount of musical noise in a stimulus, a novel measure, based on the characteristics of musical noise in time and frequency, is proposed. This measure incorporates a parametric outlier detection method to classify musical components. High correlations with the outcome of the listening experiment are obtained, i.e. rho = 0.90 for enhanced noisy speech signals at various input SNRs.
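
To make the role of the over-subtraction parameter concrete, the following is a minimal sketch of textbook power spectral subtraction applied to one frame of magnitude spectra. It is not taken from the thesis: the function name `spectral_subtract_frame`, the parameter names `alpha` (over-subtraction factor) and `beta` (spectral floor), and their default values are assumptions chosen for illustration.

```python
import math


def spectral_subtract_frame(noisy_mag, noise_mag, alpha=2.0, beta=0.01):
    """Power spectral subtraction for a single frame (illustrative sketch).

    noisy_mag: magnitudes of the noisy speech spectrum, one value per bin
    noise_mag: estimated noise magnitudes for the same bins
    alpha:     over-subtraction factor (assumed name and default)
    beta:      spectral floor factor (assumed name and default)
    """
    enhanced = []
    for y, n in zip(noisy_mag, noise_mag):
        noise_pow = n * n
        # Subtract the scaled noise power estimate from the noisy power.
        diff = y * y - alpha * noise_pow
        # Clamp to a small fraction of the noise power so the result
        # never becomes negative.
        enhanced.append(math.sqrt(max(diff, beta * noise_pow)))
    return enhanced


# One frame with two bins: the first keeps residual energy, the second
# falls below zero after subtraction and is clamped to the floor.
frame = spectral_subtract_frame([2.0, 1.0], [1.0, 1.0])
```

In this kind of scheme, isolated bins that survive the subtraction from frame to frame are commonly what is perceived as musical noise. A larger `alpha` removes more of them but also removes more speech energy, which is the trade-off the abstract refers to.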

Subject

instrumental measure
kurtosis
listening experiment
outlier detection
musical noise

To reference this document use:

http://resolver.tudelft.nl/uuid:e7e6419e-877d-4874-a212-993e4fd4f568

Embargo date

2020-01-01

Part of collection

Student theses

Document type

master thesis

Rights

Files

PDF

Thesis_Martijn_Gerrits_.pdf

636.4 KB
