Poorfocus t1_jdugyhm wrote

Yeah, it’s called ElevenLabs. The first tier is a bit cheaper than that, and it’s really fantastic. I tested it out by recording some friends having natural conversations, speaking directly into their microphones on Discord (w/ consent!).

One thing: you have to turn the stability parameter way down from the default, or else the intonation is very stiff and it sounds robotic. Bring it down to 20% and generate a few times; when it gets the likeness right, it’s perfect, to the point where even the person it was emulating found it convincing.
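If you’d rather script this than use the web UI, here’s a rough sketch of the same idea: set stability to 0.2 (i.e. 20%) and generate several takes to pick from. The endpoint URL, the `xi-api-key` header, and the `voice_settings` field names are my recollection of the public ElevenLabs REST API, so treat them as assumptions and check the current docs. `similarity_boost=0.75`, the helper names, the voice ID, and the output file names are placeholders I made up for illustration.

```python
import json
import urllib.request

# Assumed endpoint shape; verify against the current ElevenLabs API reference.
API_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(text, voice_id, api_key, stability=0.2, similarity_boost=0.75):
    """Build (but do not send) a text-to-speech request with low stability."""
    payload = {
        "text": text,
        "voice_settings": {
            "stability": stability,  # 0.2 corresponds to the 20% slider value
            "similarity_boost": similarity_boost,  # placeholder value, not from the post
        },
    }
    return urllib.request.Request(
        API_URL.format(voice_id=voice_id),
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def generate_takes(text, voice_id, api_key, takes=3):
    """Send the same request several times and save each result.

    Output varies from run to run, so keeping every take lets you pick
    the one where the likeness came out right.
    """
    paths = []
    for i in range(takes):
        req = build_tts_request(text, voice_id, api_key)
        with urllib.request.urlopen(req) as resp:
            audio = resp.read()
        path = f"take_{i + 1}.mp3"  # assumes the response body is MP3 audio
        with open(path, "wb") as f:
            f.write(audio)
        paths.append(path)
    return paths
```

Calling `generate_takes("Some line to read.", "YOUR_VOICE_ID", "YOUR_API_KEY")` would write `take_1.mp3` through `take_3.mp3`, assuming the request shape above matches the live API.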

I’m curious how it handles the reading, since it’s obviously context-aware to some extent. I think we as humans are very good at picking out “poor acting” and unnatural vocal delivery. I think when that improves, we’ll have completely natural language conversations with the AI.