Title: Using cross-model learnings for the Gram Vaani ASR Challenge 2022
Author: Patel, T.B. (TU Delft Multimedia Computing); Scharenborg, O.E. (TU Delft Multimedia Computing)
Date: 2022
Abstract: In the diverse and multilingual land of India, Hindi is spoken as a first language by a majority of its population. Efforts are being made to collect audio, transcriptions, dictionaries, and other resources for developing speech-technology applications in Hindi. In the same spirit, the Gram-Vaani ASR Challenge 2022 provides spontaneous telephone speech in Hindi with natural background noise and regional variation. The challenge provides 100 hours of labeled training data (train-set), 5 hours of labeled development data (dev-set), and 1000 hours of unlabeled data. For the 'Closed Challenge', we trained an End-to-End (E2E) Conformer model using speed perturbation and SpecAugment, and used VTLN to handle any unknown speaker groups in the blind evaluation set. On the dev-set, we achieved a 30.3% WER compared to the 34.8% WER of the Challenge E2E baseline. For the 'Self Supervised Closed Challenge', we used a semi-supervised learning approach. We generated pseudo-transcripts for the unlabeled data using a hybrid TDNN-3gram LM model and trained an E2E model on them. This model was then used as a seed for retraining the E2E model on high-confidence data. Cross-model learning and refinement of the E2E model gave a 25.3% WER on the dev-set, compared to ∼33-35% WER for the Challenge baselines that use wav2vec models.
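The abstract mentions retraining the E2E model on high-confidence pseudo-transcribed data but does not say how that data is selected. The following is a minimal sketch of one common way to do it, assuming each pseudo-transcript comes with a single confidence score in [0, 1] and that selection is a plain threshold. The `PseudoLabel` type, its field names, the `select_high_confidence` helper, the threshold value, and the toy utterances are all hypothetical illustrations, not the authors' implementation or data.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PseudoLabel:
    """One automatically transcribed utterance (all field names are hypothetical)."""
    utt_id: str
    text: str
    confidence: float  # assumed to lie in [0, 1]; higher means more reliable


def select_high_confidence(
    labels: Iterable[PseudoLabel], threshold: float
) -> List[PseudoLabel]:
    """Keep non-empty pseudo-transcripts whose confidence reaches the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    return [p for p in labels if p.text.strip() and p.confidence >= threshold]


if __name__ == "__main__":
    # Toy values made up for illustration; they are not taken from the paper.
    decoded = [
        PseudoLabel("utt1", "namaste", 0.97),
        PseudoLabel("utt2", "haan ji", 0.41),
        PseudoLabel("utt3", "", 0.93),  # empty hypothesis, dropped regardless of score
    ]
    kept = select_high_confidence(decoded, threshold=0.9)
    print([p.utt_id for p in kept])
```

In a pipeline like the one described, the retained subset would be combined with the labeled data for the next round of E2E training. The threshold trades the amount of added data against the noise in its transcripts, so it would normally be tuned on the dev-set.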
Subject: cross-architecture learning; end-to-end ASR; Gram-Vaani Challenge; hybrid ASR; semi-supervised learning
To reference this document use: http://resolver.tudelft.nl/uuid:c3fe23e3-a025-49a6-953c-30349cae003a
DOI: https://doi.org/10.21437/Interspeech.2022-10639
Embargo date: 2023-07-01
ISSN: 2308-457X
Source: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2022-September, 4880-4884
Event: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 2022-09-18 → 2022-09-22, Incheon, Korea, Republic of
Bibliographical note: Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Part of collection: Institutional Repository
Document type: journal article
Rights: © 2022 T.B. Patel, O.E. Scharenborg
Files: PDF patel22_interspeech.pdf (533.97 KB)