soobardo

soobardo t1_japo5w5 wrote

Yes, they pair up perfectly. Whisper detects anything I babble to it, english or french and it's surprisingly fast. I've wrapped a loop that:

listens micro -> whisper STT -> chatgpt -> lang detect -> Google TTS -> speaker

With noise/silence detection, it's a complete hands-off experience, like chatting with a real person. Delay is ~ 5s for all calls. "Glueing" the APIs is straightforward and intuitive.

2