Open
Labels
bug: Something isn't working
Description
Bug description
If your text includes OOVs, such as digits, typos, or unknown words (where unknown words are those not in the CMU dict used to build the g2p English mapping), they are simply stripped out of the text before training or synthesis when you are converting to phones.
E.g., "testing 123 testings test" gets g2p'd to "tɛstɪŋ tɛst", which is not great for training and potentially catastrophic for synthesis.
When a given utterance consists exclusively of OOVs, you get a stack dump as described in #741.
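To make the failure mode concrete, here is a minimal sketch of a pre-check that reports OOVs instead of letting them vanish silently. It is not the project's code: `TOY_LEXICON`, `find_oovs`, and `check_utterance` are hypothetical names, and the two-entry lexicon only stands in for the CMU-dict-based mapping.

```python
# Hypothetical toy lexicon standing in for the CMU-dict-based g2p mapping.
TOY_LEXICON = {"testing": "tɛstɪŋ", "test": "tɛst"}


def find_oovs(text, lexicon=TOY_LEXICON):
    """Return the whitespace-separated tokens that the lexicon cannot convert."""
    return [tok for tok in text.lower().split() if tok not in lexicon]


def check_utterance(text, lexicon=TOY_LEXICON):
    """Return the OOV tokens, or raise a readable error if every token is an OOV."""
    tokens = text.lower().split()
    oovs = find_oovs(text, lexicon)
    if tokens and len(oovs) == len(tokens):
        # An OOV-only utterance would otherwise become an empty phone string.
        raise ValueError(f"utterance contains only OOVs: {oovs}")
    return oovs


print(find_oovs("testing 123 testings test"))
```

A check like this could run before training or synthesis so that the user sees which tokens would be dropped, and an OOV-only utterance produces a readable error rather than a stack dump.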
This problem was noticed by @marctessier a few weeks ago.
Possible fixes suggested by @roedoejet:
- fall back to und, like readalongs does?
- fall back to a neural g2p model?
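Both suggestions share the same shape: try the lexicon first, and hand anything it cannot convert to a second converter. The sketch below illustrates that shape under stated assumptions. `TOY_LEXICON`, `FALLBACK_LETTERS`, `fallback_g2p`, and `g2p_with_fallback` are hypothetical; the letter table is a stand-in for a real fallback such as an und mapping or a neural g2p model, not an implementation of either.

```python
# Hypothetical toy lexicon standing in for the CMU-dict-based g2p mapping.
TOY_LEXICON = {"testing": "tɛstɪŋ", "test": "tɛst"}

# Hypothetical letter-to-phone table standing in for a real fallback converter.
FALLBACK_LETTERS = {"t": "t", "e": "ɛ", "s": "s", "i": "ɪ", "n": "n", "g": "g"}


def fallback_g2p(token):
    """Convert a token letter by letter; letters missing from the table are skipped."""
    return "".join(FALLBACK_LETTERS.get(ch, "") for ch in token)


def g2p_with_fallback(text, lexicon=TOY_LEXICON, fallback=fallback_g2p):
    """Return (phone string, unresolved tokens) instead of silently dropping OOVs."""
    phones, unresolved = [], []
    for tok in text.lower().split():
        out = lexicon.get(tok) or fallback(tok)
        if out:
            phones.append(out)
        else:
            # Neither the lexicon nor the fallback produced anything (e.g. digits here).
            unresolved.append(tok)
    return " ".join(phones), unresolved


phones, unresolved = g2p_with_fallback("testing 123 testings test")
print(phones, unresolved)
```

With this toy table, "testings" is recovered through the fallback while "123" is still unconvertible, but it is returned in the unresolved list so the caller can warn or fail rather than lose it without notice.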