NWCoffeenut t1_jdsgb83 wrote

I think a good part of the latency was with the TTS system. The actual text response for the most part came back reasonably quickly.

26

illathon t1_jdsoud8 wrote

No, most implementations of Whisper are slow.

2

itsnotlupus t1_jdt280v wrote

Whisper is the speech recognition component.
I don't think he said what he's using for TTS; it might be the macOS built-in one.

4

eggsnomellettes t1_jdt5dxl wrote

They're using ElevenLabs, which isn't local, so every response needs a slow API call.

11

tortoise888 t1_jdtp8yj wrote

If we eventually get open-source, ElevenLabs-quality models running locally, it's gonna be insane.

1

ebolathrowawayy t1_jdvfmrk wrote

There's also Tortoise TTS, which can be run locally, but idk how fast it is.

1