Open
Description
I retrained a HiFi-GAN vocoder on data sampled at 16 kHz and replaced the original generator_universal.pth.tar in the FastSpeech2 project with the resulting checkpoint (g_00150000). I then trained FastSpeech2 on the same dataset. During training, however, the audio samples logged to TensorBoard had noticeable problems, including speaker drift and dropped phonetic segments. Does anyone have insight into what might cause these issues?
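One thing that may be worth ruling out, although nothing in this report confirms it as the cause, is a mismatch between the mel-spectrogram settings used to preprocess data for FastSpeech2 and those used to retrain the vocoder. The sketch below compares two flat dictionaries of such settings. The key names, the `KEY_MAP` mapping, the `find_mismatches` helper, and all values are assumptions chosen for illustration, not read from either project's actual configuration files.

```python
# Hypothetical mapping from acoustic-model setting names to vocoder setting names.
# These names are assumptions for illustration; check them against the real configs.
KEY_MAP = {
    "sampling_rate": "sampling_rate",
    "filter_length": "n_fft",
    "hop_length": "hop_size",
    "win_length": "win_size",
    "n_mel_channels": "num_mels",
    "mel_fmin": "fmin",
    "mel_fmax": "fmax",
}


def find_mismatches(acoustic_cfg, vocoder_cfg):
    """Return {acoustic_key: (acoustic_value, vocoder_value)} for settings that differ."""
    mismatches = {}
    for a_key, v_key in KEY_MAP.items():
        a_val = acoustic_cfg.get(a_key)
        v_val = vocoder_cfg.get(v_key)
        if a_val != v_val:
            mismatches[a_key] = (a_val, v_val)
    return mismatches


# Illustrative values only; they are not taken from the report.
acoustic = {
    "sampling_rate": 16000,
    "filter_length": 1024,
    "hop_length": 256,
    "win_length": 1024,
    "n_mel_channels": 80,
    "mel_fmin": 0,
    "mel_fmax": 8000,
}
vocoder = {
    "sampling_rate": 16000,
    "n_fft": 1024,
    "hop_size": 200,
    "win_size": 800,
    "num_mels": 80,
    "fmin": 0,
    "fmax": 8000,
}

for name, (a_val, v_val) in find_mismatches(acoustic, vocoder).items():
    print(f"{name}: acoustic model uses {a_val}, vocoder uses {v_val}")
```

If such a comparison reports no differences, the vocoder's input format is less likely to be the source of the artifacts, and the acoustic model's own training (alignments, speaker embeddings, number of steps) would be the next place to look.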