Nervous-Newt848 t1_ja5miqj wrote

First off... very interesting... But just so you know, that wouldn't be a language model anymore

They don't really have a term for that other than multimodal... Multimodal world model???

Models can't speak or hear when they want to... it's just not part of their programming

They respond to input

So if they are receiving continuous input... Theoretically they should be continuously outputting...
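
A rough sketch of what that would look like: a small recurrent model that emits something at every timestep of a stream. Toy PyTorch only; the class name, dimensions, and the fake stimulus loop are all made up for illustration:

```python
# Toy sketch: continuous input -> continuous output, one emission per timestep.
import torch
import torch.nn as nn

class StreamResponder(nn.Module):
    def __init__(self, in_dim=16, hidden_dim=32, out_dim=8):
        super().__init__()
        self.rnn = nn.GRUCell(in_dim, hidden_dim)
        self.head = nn.Linear(hidden_dim, out_dim)

    def step(self, stimulus, hidden):
        hidden = self.rnn(stimulus, hidden)   # update internal state
        output = self.head(hidden)            # emit something every step
        return output, hidden

model = StreamResponder()
hidden = torch.zeros(1, 32)

# Fake "continuous" input: one stimulus vector per timestep.
for t in range(5):
    stimulus = torch.randn(1, 16)
    with torch.no_grad():
        output, hidden = model.step(stimulus, hidden)
    print(t, output.shape)
```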

The whole conversation history could be saved into a database

Reward models are currently trained on texts with scores given by humans. It's called RLHF, or Reinforcement Learning from Human Feedback... The AI doesn't do the scoring... That's for language models, though...
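
For anyone curious, this is roughly how a reward model picks up those human judgments: in the usual setup humans rank pairs of responses, and the model is trained to give the preferred one a higher scalar score. A toy sketch, with random embeddings standing in for real text (all names here are illustrative):

```python
# Toy reward-model sketch: learn to score human-preferred responses higher.
import torch
import torch.nn as nn

reward_model = nn.Sequential(
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 1),   # scalar "how good is this response" score
)
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Pretend each row is an embedding of a model response; humans said the
# "chosen" one is better than the "rejected" one.
chosen = torch.randn(8, 64)
rejected = torch.randn(8, 64)

for _ in range(100):
    r_chosen = reward_model(chosen)
    r_rejected = reward_model(rejected)
    # Pairwise preference loss: push the chosen score above the rejected one.
    loss = -nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The trained reward model is then what scores the language model's outputs during the RL fine-tuning stage, instead of a human scoring every single response.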

How could they know what is good and what is bad???

Now, for world models, reinforcement learning works differently... which is probably what you're referring to... I won't go into it because it's pretty complex...

Updating its weights continuously is currently impractical due to an energy-inefficiency problem with the von Neumann hardware architecture... basically traditional CPUs and GPUs... More basically, it requires too many computations and too much electricity to continuously "backpropagate" (a data science term) over the incoming data...
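
To make the cost concrete, here's a toy PyTorch comparison of one inference step vs one learning step: inference is a single forward pass, while a weight update adds a full backward pass plus an optimizer write over every parameter. This is just an illustrative sketch, not a benchmark:

```python
# Toy sketch: why "updating weights continuously" is expensive.
import torch
import torch.nn as nn

model = nn.Linear(512, 512)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
x = torch.randn(1, 512)

# Inference: forward pass only, no gradients stored.
with torch.no_grad():
    y = model(x)

# Learning: forward, backward (a gradient for every weight), then an update.
# Doing this for every token from every user, forever, is the cost problem.
y = model(x)
loss = y.pow(2).mean()        # stand-in objective, made up for the sketch
optimizer.zero_grad()
loss.backward()               # backpropagation: extra compute and memory
optimizer.step()              # write new values for every parameter
```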

Conversations shouldn't be encoded into a language model's weights either, imo... because of "hallucinations" it may make up things that didn't happen

Querying a database of old conversations is better and will always be more accurate
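
Something like this, as a minimal sketch using SQLite from the Python standard library (the table and column names are made up):

```python
# Toy sketch: store conversation turns and query them back verbatim,
# rather than relying on the model's (fallible) recall.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE turns (ts INTEGER, speaker TEXT, text TEXT)")

# Save each turn as it happens.
db.execute("INSERT INTO turns VALUES (1, 'user', 'My cat is named Miso')")
db.execute("INSERT INTO turns VALUES (2, 'model', 'Nice name!')")

# Later, retrieve the exact record instead of asking the model to remember.
rows = db.execute(
    "SELECT speaker, text FROM turns WHERE text LIKE ?", ("%cat%",)
).fetchall()
print(rows)   # [('user', 'My cat is named Miso')]
```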

For an AGI to truly be AGI, by definition it needs to be able to learn any task... This is currently possible server-side through manual backpropagation... but it is not possible continuously, the way human brains work...

Humans continuously learn...

An AI neural network learns by being fed data in a run that someone starts manually, usually from a command line interface... This is called "training"... data science terminology

An AI neural network model is then "deployed", aka loaded and run on a single GPU or multiple GPUs depending on model size... When a language model is running it is said to be in "inference mode"... more terminology
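
In PyTorch terms, that split looks roughly like this (toy model, made-up data, just to show offline training vs deployed inference):

```python
# Toy sketch: train offline, then deploy in inference mode with no learning.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# --- training (done server-side, kicked off manually) ---
model.train()
for _ in range(50):
    x, target = torch.randn(32, 10), torch.randint(0, 2, (32,))
    loss = nn.functional.cross_entropy(model(x), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# --- deployment / "inference mode" (no weights change here) ---
model.eval()
with torch.inference_mode():
    prediction = model(torch.randn(1, 10)).argmax(dim=1)
```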

We need an entirely different hardware architecture in order to run AI neural networks in training and inference mode simultaneously...

Photonics or neuromorphic computing, perhaps a combination of both... these seem like the way forward, in my opinion

2

turnip_burrito t1_ja5y3hb wrote

I agree with all of this, but just to be a bit over-pedantic on one bit:

> Models can't speak or hear when they want to... it's just not part of their programming.

As you said, it's not part of their programming in today's models. In general, though, it wouldn't be too difficult to construct a new model that judges at each timestep, based on both external stimuli and internal hidden states, when to speak/interrupt or listen intently. Actually, at first glance such a thing sounds trivial.
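
Something like this, very roughly: a recurrent core plus a small gate head that reads both the current stimulus and the hidden state and outputs a speak/listen probability. Toy PyTorch; the class name, dimensions, and the 0.5 threshold are all made up:

```python
# Toy sketch: decide at each timestep whether to speak or keep listening,
# conditioned on the external stimulus and the internal hidden state.
import torch
import torch.nn as nn

class TurnTaker(nn.Module):
    def __init__(self, in_dim=16, hidden_dim=32):
        super().__init__()
        self.rnn = nn.GRUCell(in_dim, hidden_dim)
        self.speak_gate = nn.Linear(in_dim + hidden_dim, 1)

    def step(self, stimulus, hidden):
        hidden = self.rnn(stimulus, hidden)
        # Gate conditions on both the stimulus and the internal state.
        p_speak = torch.sigmoid(
            self.speak_gate(torch.cat([stimulus, hidden], dim=-1))
        )
        return p_speak, hidden

model = TurnTaker()
hidden = torch.zeros(1, 32)
for t in range(5):
    stimulus = torch.randn(1, 16)
    with torch.no_grad():
        p_speak, hidden = model.step(stimulus, hidden)
    print(t, "speak" if p_speak.item() > 0.5 else "listen")
```

The architecture itself really is simple; training it to make good interruption decisions is the harder part.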

1