
DamienLasseur t1_j379x65 wrote

However, the hardware needed to train the model and run inference is insanely expensive. If this were to work, we'd need someone with access to a lot of cloud compute, a supercomputer, or Google TPUs. The ChatGPT model alone requires ~350GB of GPU memory just to generate an output (i.e. to perform inference). Now imagine a model capable of all that and more: it would require an enormous amount of compute.
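
For a rough sense of where that ~350GB figure comes from, here's a back-of-the-envelope sketch. The GPT-3-scale parameter count (175B) and fp16 weight storage are assumptions, and it ignores activation and KV-cache overhead:

```python
# Back-of-the-envelope GPU memory estimate for inference.
# Assumptions: GPT-3-scale parameter count, fp16 weights (2 bytes each).
params = 175e9          # assumed parameter count
bytes_per_param = 2     # fp16 precision
weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")  # ~350 GB, before runtime overhead
```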

10

4e_65_6f t1_j37ambo wrote

>The ChatGPT model alone requires ~350GB of GPU memory to generate an output (essentially performing inference). So imagine a model capable of all that and more? It'd require a lot of compute power.

I didn't say "try training LLMs on your laptop". I know that's not feasible.

The point of trying independently is to do something different from what they're doing. You're not supposed to copy what's already being done; you're supposed to code what you think would work.

Because, well, LLMs aren't AGI, and we don't know yet if they ever will be.

1

DamienLasseur t1_j37b4sv wrote

Proto-AGI will likely be a multimodal system, so if it's developed within the next 5 years or so it will include some variant of the transformer for language, in addition to other NN architectures, along the lines of the sketch below.
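
To make that concrete, here's a minimal PyTorch sketch of that kind of multimodal setup: a toy CNN vision encoder fused with a small transformer language model. All of the sizes, and the fusion strategy of prepending a single image token to the text sequence, are illustrative assumptions, not any specific published architecture.

```python
import torch
import torch.nn as nn

class TinyMultimodal(nn.Module):
    def __init__(self, vocab_size=1000, d_model=256):
        super().__init__()
        # Vision branch: CNN features pooled into a single "image token"
        self.vision = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, d_model),
        )
        # Language branch: token embeddings + transformer encoder
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, image, tokens):
        img_tok = self.vision(image).unsqueeze(1)   # (B, 1, d_model)
        txt_tok = self.embed(tokens)                # (B, T, d_model)
        seq = torch.cat([img_tok, txt_tok], dim=1)  # fuse the two modalities
        out = self.transformer(seq)
        return self.lm_head(out[:, 1:])             # predictions at text positions

model = TinyMultimodal()
logits = model(torch.randn(2, 3, 64, 64), torch.randint(0, 1000, (2, 8)))
print(logits.shape)  # torch.Size([2, 8, 1000])
```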

5