gunshoes t1_j7nj5co wrote on February 8, 2023 at 2:00 AM

FastSpeech 2 would be your best bet.

nmfisher t1_j7osgdc wrote on February 8, 2023 at 9:41 AM

FS2 is fine for training a TTS model from scratch, but I haven't come across a good FS2 model for cloning (which is basically zero-shot TTS).

You can throw in GSTs (global style tokens) or use a speaker embedding to influence the energy/pitch outputs. The sound is meh, but it works.
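For anyone wondering what the speaker-embedding route looks like, here is a rough sketch, not taken from any particular FS2 repo. It assumes the usual multi-speaker trick of projecting the speaker embedding to the hidden size and adding it to every frame of the encoder output before the pitch/energy predictors. All names (`condition_on_speaker`, `pitch_head`, `energy_head`, the sizes) are made up for illustration, and the weights are random rather than trained.

```python
import random


def init_linear(n_in, n_out, rng):
    """Random weights and zero bias for a toy linear layer (untrained)."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weight, [0.0] * n_out


def linear(x, weight, bias):
    """Compute W x + b for a single vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def condition_on_speaker(hidden, speaker_emb, proj):
    """Add the projected speaker embedding to every frame of the encoder output."""
    spk = linear(speaker_emb, *proj)
    return [[h + s for h, s in zip(frame, spk)] for frame in hidden]


def predict_scalar(hidden, head):
    """Predict one scalar per frame (stands in for a pitch or energy predictor)."""
    return [linear(frame, *head)[0] for frame in hidden]


rng = random.Random(0)
HIDDEN, SPK, FRAMES = 8, 4, 5  # hypothetical sizes

spk_proj = init_linear(SPK, HIDDEN, rng)
pitch_head = init_linear(HIDDEN, 1, rng)
energy_head = init_linear(HIDDEN, 1, rng)

# Stand-ins for the encoder output and two speaker embeddings.
hidden = [[rng.gauss(0, 1) for _ in range(HIDDEN)] for _ in range(FRAMES)]
speaker_a = [rng.gauss(0, 1) for _ in range(SPK)]
speaker_b = [rng.gauss(0, 1) for _ in range(SPK)]

cond_a = condition_on_speaker(hidden, speaker_a, spk_proj)
cond_b = condition_on_speaker(hidden, speaker_b, spk_proj)

pitch_a = predict_scalar(cond_a, pitch_head)
pitch_b = predict_scalar(cond_b, pitch_head)
energy_a = predict_scalar(cond_a, energy_head)
```

Same text, different speaker vector, different pitch/energy contours. For cloning, the speaker vector would come from a pretrained speaker encoder run on the reference audio, which is where the quality tends to fall apart.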

That's why I added the qualifier "good" :)