ravi-devgoblet/iisc_cohort4_group6_capstone

Training a speech recogniser using data from speech synthesis

Domain: Audio

Dr. Prasanta Kumar Ghosh (prasantg@iisc.ac.in)

Short Description:

Training an automatic speech recognition (ASR) system generally requires paired speech and text data. In this problem statement, you will train an ASR system using only text. A pre-trained multi-speaker text-to-speech (TTS) model generates speech from that text. The generated speech, together with the corresponding text, is then used to train the ASR system. The trained ASR system will be evaluated on unseen sentences, for both seen and unseen speakers from the multi-speaker TTS system.
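The evaluation setup described above implies three sets of (speaker, sentence) pairs: training pairs, unseen sentences spoken by seen speakers, and unseen sentences spoken by unseen speakers. The following is a minimal sketch of one way to plan those sets before any audio is synthesised. The function name `make_splits`, the split fractions, and the speaker IDs are illustrative assumptions and do not come from the problem statement.

```python
import random
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (speaker_id, sentence)


def make_splits(
    sentences: List[str],
    speakers: List[str],
    held_out_sentence_frac: float = 0.2,  # assumed fraction, not from the source
    held_out_speaker_frac: float = 0.25,  # assumed fraction, not from the source
    seed: int = 0,
) -> Dict[str, List[Pair]]:
    """Plan which (speaker, sentence) pairs to synthesise for each set."""
    if len(sentences) < 2 or len(speakers) < 2:
        raise ValueError("need at least two sentences and two speakers")

    rng = random.Random(seed)
    sents = sentences[:]
    spks = speakers[:]
    rng.shuffle(sents)
    rng.shuffle(spks)

    # Hold out at least one sentence and one speaker for evaluation.
    n_test_sents = max(1, int(len(sents) * held_out_sentence_frac))
    n_unseen_spks = max(1, int(len(spks) * held_out_speaker_frac))

    test_sents, train_sents = sents[:n_test_sents], sents[n_test_sents:]
    unseen_spks, seen_spks = spks[:n_unseen_spks], spks[n_unseen_spks:]

    return {
        # Seen speakers reading training sentences.
        "train": [(spk, s) for spk in seen_spks for s in train_sents],
        # Seen speakers reading held-out sentences.
        "test_seen_speakers": [(spk, s) for spk in seen_spks for s in test_sents],
        # Held-out speakers reading held-out sentences.
        "test_unseen_speakers": [(spk, s) for spk in unseen_spks for s in test_sents],
    }
```

Each pair in the returned lists would then be passed to the TTS model to produce one utterance. Keeping the held-out sentences out of the training list is what makes them "unseen" at evaluation time.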

Pretrained multispeaker TTS using coqui-ai: https://huggingface.co/projecte-aina/tts-ca-coqui-vits-multispeaker

Training ASR using speechbrain: https://colab.research.google.com/drive/1aFgzrUv3udM_gNJNUoLaHIm78QHtxdIz?usp=sharing
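SpeechBrain recipes commonly read their data from a JSON manifest that maps each utterance ID to its fields. The sketch below shows one possible way to write such a manifest for synthesised utterances. The field names `wav`, `length`, and `words`, the helper names, and the example paths are assumptions for illustration; the linked notebook may use a different layout, so they should be matched to whatever its data pipeline expects.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, Tuple

# (utterance_id, wav_path, transcript, duration_in_seconds)
Item = Tuple[str, str, str, float]


def build_manifest(items: Iterable[Item]) -> Dict[str, dict]:
    """Map each utterance ID to its audio path, duration, and transcript."""
    manifest: Dict[str, dict] = {}
    for utt_id, wav_path, text, duration in items:
        if utt_id in manifest:
            raise ValueError(f"duplicate utterance id: {utt_id}")
        manifest[utt_id] = {
            "wav": wav_path,               # assumed field name
            "length": round(duration, 3),  # assumed field name, in seconds
            "words": text,                 # assumed field name
        }
    return manifest


def write_manifest(manifest: Dict[str, dict], path: str) -> None:
    """Write the manifest as UTF-8 JSON, keeping non-ASCII text readable."""
    Path(path).write_text(
        json.dumps(manifest, ensure_ascii=False, indent=2), encoding="utf-8"
    )
```

A manifest written this way could, for example, be loaded with `speechbrain.dataio.dataset.DynamicItemDataset.from_json`. Writing one manifest per set (training, seen-speaker test, unseen-speaker test) keeps the evaluation conditions separate.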

Reference(s): 1. https://arxiv.org/pdf/2306.00998.pdf
