
androiddrew t1_iyeipwa wrote

I think there is a lot of room for a better virtual assistant experience. Remember, these devices came on the market almost 10 years ago, and the interaction model they shipped with was strictly request/response. Models for ASR, NLP, and TTS have gotten a lot better since then. Toss in a non-toxic large language model (GPT-3 but better) and it's clear the experience of these assistants hasn't really kept up with the times.
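Roughly, the modern stack I'm picturing is just three stages glued together. Here's a minimal sketch in Python, assuming openai-whisper for ASR and pyttsx3 for TTS; generate_reply() is a hypothetical stand-in for whatever LLM sits in the middle:

```python
# Minimal assistant turn loop: speech -> text -> LLM -> speech.
import whisper   # pip install openai-whisper
import pyttsx3   # pip install pyttsx3

asr = whisper.load_model("base")   # small on-device ASR model
tts = pyttsx3.init()               # offline TTS engine

def generate_reply(prompt: str) -> str:
    """Hypothetical LLM call; swap in your model of choice."""
    raise NotImplementedError

def assistant_turn(wav_path: str) -> None:
    text = asr.transcribe(wav_path)["text"]   # ASR: audio file -> transcript
    reply = generate_reply(text)              # LLM: transcript -> response
    tts.say(reply)                            # TTS: response -> audio
    tts.runAndWait()
```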

I think there is a market for a higher-end virtual assistant: one that uses a MetaHuman avatar and a more modern model stack to allow for a much more human-like interaction with the machine. The big gap I see is the lack of a context-based memory in that type of interaction. The movie “Her” did a great job with the idea that the VA learned who you are and kept a record, a “memory,” of your interactions with it. I know channel theory isn’t popular, but a visual channel could also improve the interaction, provided you cross the uncanny valley. That’s where I think a MetaHuman avatar could fit. Given enough time, we can curate an animated experience that feels much more lifelike.
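To make the memory idea concrete, here's a toy sketch: store each exchange as an embedding and pull back the most relevant ones on every new turn. embed() is a hypothetical stand-in for any sentence-embedding model:

```python
# Toy context-based memory: retrieve past utterances by cosine similarity.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical sentence-embedding call; use any embedding model."""
    raise NotImplementedError

class ConversationMemory:
    def __init__(self) -> None:
        self.entries: list[tuple[str, np.ndarray]] = []

    def remember(self, utterance: str) -> None:
        # Store the raw text alongside its embedding.
        self.entries.append((utterance, embed(utterance)))

    def recall(self, query: str, k: int = 3) -> list[str]:
        # Return the k past utterances most similar to the current query.
        q = embed(query)
        scored = sorted(
            self.entries,
            key=lambda e: float(
                np.dot(q, e[1]) / (np.linalg.norm(q) * np.linalg.norm(e[1]))
            ),
            reverse=True,
        )
        return [text for text, _ in scored[:k]]
```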

To address the security issue, you can have an edge device like a Jetson Orin host all the data and models locally (except an LLM). Or, hell, just allow it to run on a PC locally. Then offer a cloud product for people who don’t care. I am actually working on a Jetson Orin-based stack to address both latency and security… it just costs $1200 for the chip module alone. So these devices aren’t cheap if you need them on the edge.
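The split I have in mind is roughly this: ASR, TTS, and the rest of the stack stay on the Orin, and only the LLM call leaves the box. A minimal sketch, where the config fields and endpoint are hypothetical:

```python
# Hybrid edge/cloud routing: everything local except the LLM.
from dataclasses import dataclass

@dataclass
class StackConfig:
    asr_on_device: bool = True                             # runs on the Orin
    tts_on_device: bool = True                             # runs on the Orin
    llm_endpoint: str | None = "https://example.com/llm"   # hypothetical cloud LLM

def route(task: str, cfg: StackConfig) -> str:
    """Decide where each stage of the pipeline executes."""
    if task == "llm" and cfg.llm_endpoint:
        return f"cloud:{cfg.llm_endpoint}"   # only the LLM leaves the device
    return "edge:jetson-orin"                # everything else stays local

print(route("asr", StackConfig()))   # -> edge:jetson-orin
print(route("llm", StackConfig()))   # -> cloud:https://example.com/llm
```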

On the topic of cost: the big companies wanted these mainstream in the home, so they sold them at a loss. I don’t think you can do that with the high-end edge solution I mentioned above.
