Training and testing the TDNN-OPGRU acoustic model on English read and spontaneous speech

Genkov, Georgi

Title

Training and testing the TDNN-OPGRU acoustic model on English read and spontaneous speech

Author

Genkov, Georgi (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Intelligent Systems)

Contributor

Feng, S. (mentor)
Scharenborg, O.E. (graduation committee)
Jonker, C.M. (graduation committee)

Degree granting institution

Delft University of Technology

Programme

Computer Science and Engineering

Project

CSE3000 Research Project

Date

2021-07-01

Abstract

Automatic phoneme recognition (APR) is the process of recognizing phonemes (spoken sounds) in a recording of speech. It can be used in any application requiring fast and accurate transcription, e.g. in a courthouse. This research builds an APR model using the TDNN-OPGRU architecture and trains it on two datasets of recorded English speech: "TIMIT", which contains prewritten sentences read aloud (prepared/read speech), and "Buckeye", which contains recorded interviews (spontaneous speech). The results of the model are analyzed and compared to those of similar research. The main conclusion is that the results obtained do not improve on previous research and in some cases are considerably worse. The reasons for this are also discussed.

Subject

Phoneme Recognition
Phoneme Error Rate
Acoustic Model
TDNN-OPGRU
English
Prepared speech
Spontaneous speech

To reference this document use:

http://resolver.tudelft.nl/uuid:350beee7-6bca-41c8-823c-dffd584736eb

Part of collection

Student theses

Document type

bachelor thesis

Files

PDF

RP_1_.pdf

340.48 KB
