Comments


f_max t1_jbze2pl wrote

Speaking as someone working on scaling beyond GPT-3 sizes: I think if there were proof of existence of human-level AI at 100T parameters, people would put down the money today to do it. It's roughly $10M to train a 100B model. With cost scaling roughly linearly with parameter count, that's $10B to train this hypothetical 100T-parameter AI. That's the cost of buying a large tech startup. But a human-level AI is probably worth more than all of big tech combined. The main thing stopping people is that no one knows whether the scaling curves will bend and we'll hit a plateau in improvement with scale, so no one has the guts to put the money down.
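The arithmetic above can be sketched as a back-of-envelope extrapolation, assuming cost scales linearly with parameter count; the $10M-per-100B anchor is the rough figure quoted in the comment, not a measured cost:

```python
# Back-of-envelope training-cost extrapolation under a linear-scaling
# assumption. The $10M anchor for a 100B-parameter model is the rough
# figure from the comment above, not a measured number.
COST_PER_100B_USD = 10e6

def training_cost_usd(params: float) -> float:
    """Estimated training cost in USD for a model with `params` parameters."""
    return COST_PER_100B_USD * (params / 100e9)

print(f"${training_cost_usd(100e12):,.0f}")  # 100T params -> $10,000,000,000
```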

5

-Rizhiy- t1_jbzfsqt wrote

> human level ai is probably worth more than all of big tech combined

What makes you say that? Where is the economic reasoning? For the vast majority of jobs, human labour costs ~$10/hour; a 100T model would most likely cost much more to run. There is a lot of uncertainty about whether even the current LLMs can be profitable.
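The wage comparison can be made concrete with a toy break-even check. All figures here except the $10/hour wage from the comment are hypothetical placeholders, not measurements of any real model:

```python
# Toy break-even check: an AI "worker" only undercuts human labour if its
# inference cost per hour of useful output is lower. The server figure
# below is a made-up placeholder for a hypothetical 100T-parameter model.
HUMAN_WAGE_PER_HOUR = 10.0   # figure from the comment above
SERVER_COST_PER_HOUR = 30.0  # hypothetical multi-GPU serving cost

def ai_is_cheaper(server_cost_per_hour: float,
                  wage_per_hour: float = HUMAN_WAGE_PER_HOUR) -> bool:
    """True if running the model beats paying a human, per hour of output."""
    return server_cost_per_hour < wage_per_hour

print(ai_is_cheaper(SERVER_COST_PER_HOUR))  # False under these assumed numbers
```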

I would say that the main thing actually stopping the training of even larger LLMs is that the economic model hasn't been figured out yet.

0

charlesrwest t1_jbz1w0u wrote

Isn't that more or less what GPT-3 was? As I recall, most of the really big models cost millions to train.

4

Mayfieldmobster t1_jbzspe0 wrote

There are tools, like Colossal-AI, that let you train very large models on much smaller hardware.

1

Username912773 t1_jbz7y8x wrote

LLMs cannot be sentient: they require input to generate an output and have no initiative. They are essentially giant probabilistic networks that calculate the probability of the next token or word.

As you scale model size up, you need not only more compute to train it but also more time and more data. So why would anyone just say "screw it" and spend potentially millions or billions of dollars on something that may or may not work and will almost certainly have little monetary return?
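The "probability of the next token" framing above can be sketched in a few lines. The vocabulary and logits here are invented for illustration; a real LLM produces logits over a vocabulary of tens of thousands of tokens from billions of learned weights:

```python
import math
import random

# Toy next-token distribution: made-up scores over a four-word vocabulary.
vocab = ["the", "cat", "sat", "mat"]
logits = [2.0, 1.0, 0.5, 0.1]

# softmax turns raw scores into a probability distribution over tokens
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

# "generating" is then just sampling the next token from that distribution
next_token = random.choices(vocab, weights=probs, k=1)[0]
```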

−2

TemperatureAmazing67 t1_jbzc8cc wrote

'Require input to generate an output and do not have initiative': you could feed it random noise, or another network's output.

Also, the argument about the next token is flawed. For a lot of tasks, a perfectly predicted next token is everything you need.

2

Username912773 t1_jbze0ug wrote

That's not a solution. It doesn't make LLMs sentient; it just makes them a cog in a larger machine.

Logic, task performance, and sentience are different things.

1

MinaKovacs t1_jbyzv1v wrote

A binary computer is nothing more than an abacus. It doesn't matter how much you scale up an abacus, it will never achieve anything even remotely like "intelligence."

−8

RedditLovingSun t1_jbz78cm wrote

Depends on your definition of intelligence. The human brain is nothing but a bunch of neurons passing electrical signals to each other; I don't see why it's impossible for computers to simulate something similar and achieve the same results a brain does.

10

MurlocXYZ t1_jbzk75t wrote

> A binary computer is nothing more than an abacus

I could say the same thing about the human brain. It's just a complex abacus.

1

MinaKovacs t1_jbzso7m wrote

One of the few things we know for certain about the human brain is that it is nothing like a binary computer. Ask any neuroscientist and they will tell you we still have no idea how the brain works. The brain operates at a quantum level, manifested in mechanical, chemical, and electromagnetic characteristics all at once. It is not a ball of transistors.

0

hebekec256 OP t1_jbz0mpm wrote

Yes, I understand that, but LLMs and extensions of LLMs (like PaLM-E) are a heck of a lot more than an abacus. I wonder what would happen if Google just said "screw it" and scaled it from 500B to 50T parameters. I'm guessing there are reasons in the architecture that it would just break; otherwise I can't see why they wouldn't do it, since the risk-to-reward ratio seems favorable to me.

0

TemperatureAmazing67 t1_jbzcn6a wrote

> extensions of LLMs (like PaLM-E) are a heck of a lot more than an abacus. I wonder what would happen if Google just said, "screw it", and scaled it from 500B to 50T parameters.

The problem is that we have scaling laws for neural networks, and we simply do not have the data for 50T parameters. We would need to get that data somehow, and finding out the answer to that question costs a lot.
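The data constraint can be roughed out with the oft-cited Chinchilla-style heuristic of ~20 training tokens per parameter; the constant is an approximation from the scaling-law literature, not an exact law:

```python
# Rough estimate of training data needed for a compute-optimal model,
# assuming the Chinchilla-style heuristic of ~20 tokens per parameter.
TOKENS_PER_PARAM = 20

def tokens_needed(params: float) -> float:
    """Approximate training tokens for a model with `params` parameters."""
    return TOKENS_PER_PARAM * params

print(f"{tokens_needed(50e12):.1e}")  # 50T params -> 1.0e+15 tokens
```

For comparison, the largest public text corpora are on the order of trillions of tokens, a few orders of magnitude short of that figure.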

3

Co0k1eGal3xy t1_jbzi8wc wrote

  1. Double descent: more parameters are MORE data-efficient.
  2. Most of these LLMs barely complete 1 epoch, so there is no concern about overfitting currently.

1

MinaKovacs t1_jbz2gqw wrote

I think the math clearly doesn't work out; otherwise, Google would have monetized it already. ChatGPT is not profitable or practical for search. Hardware costs, power consumption, and slow performance are already at their limits. It will take something revolutionary, beyond binary computing, to make ML anything more than expensive algorithmic pattern recognition.

−1