Atom_101 t1_j3vwoid wrote

Yeah. It supports zero-shot voice cloning from a reference clip.

5

Elleo t1_j3w67mx wrote

It's worth noting that the output is still heavily influenced by the initial training data. I had a play with the model here: https://replicate.com/afiaka87/tortoise-tts and everything comes out with an American accent.

3

CeFurkan OP t1_j3w9hu6 wrote

Are you able to generate speech based on given timings, e.g. by providing an SRT or VTT file, or by converting speech audio into equivalent timed speech?

Thanks so much for the answers.

1