
gaudiocomplex t1_j4xkj75 wrote

It may be multimodal. And that may have been the difference in achieving some semblance of AGI. That is 100% speculation, but I worked for a long time with an NLP system that focused on human-level metadata editing of sound files at scale. There is plenty of data out there to feed into the machine.

But on a more certain level, you have to realize that language itself models reality, and when LLMs are able to model language more accurately, they're able to produce a more real reality. Some of the things it is doing right now in terms of errors and dumb mistakes, those won't be happening anymore. We will have a much more difficult time sussing out what's real and what's not. The banal ways that it communicates now... I don't think that will be the case either.

16

Northcliff t1_j4zmasl wrote

It’s 100% definitely not multimodal

The level of making shit up in this sub is astronomical

12