Comments


f_max t1_jbze2pl wrote

Speaking as someone working on scaling beyond GPT-3 sizes: I think if there were proof of existence of human-level AI at 100T parameters, people would put down the money today to do it. It's roughly $10M to train a 100B model. With cost scaling roughly linearly with parameter count, that's $10B to train this hypothetical 100T-parameter AI. That's the cost of buying a large tech startup. But a human-level AI is probably worth more than all of big tech combined. The main thing stopping people is that no one knows whether the scaling curves will bend and we'll hit a plateau in improvement with scale, so no one has the guts to put the money down.
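The arithmetic above can be sketched as a back-of-envelope extrapolation, assuming cost scales linearly with parameter count; the $10M-per-100B anchor is the rough figure quoted in the comment, not a measured cost:

```python
# Back-of-envelope training-cost extrapolation under a linear-scaling
# assumption. The $10M anchor for a 100B-parameter model is the rough
# figure from the comment above, not a measured number.
COST_PER_100B_USD = 10e6

def training_cost_usd(params: float) -> float:
    """Estimated training cost in USD for a model with `params` parameters."""
    return COST_PER_100B_USD * (params / 100e9)

print(f"${training_cost_usd(100e12):,.0f}")  # 100T params -> $10,000,000,000
```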

5

-Rizhiy- t1_jbzfsqt wrote

> human level ai is probably worth more than all of big tech combined

What makes you say that? Where is the economic reasoning? For the vast majority of jobs, human labour costs ~$10/hour; a 100T model would most likely cost much more to run. There is a lot of uncertainty about whether even the current LLMs can be profitable.
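The wage comparison can be made concrete with a toy break-even check. All figures here except the $10/hour wage from the comment are hypothetical placeholders, not measurements of any real model:

```python
# Toy break-even check: an AI "worker" only undercuts human labour if its
# inference cost per hour of useful output is lower. The server figure
# below is a made-up placeholder for a hypothetical 100T-parameter model.
HUMAN_WAGE_PER_HOUR = 10.0   # figure from the comment above
SERVER_COST_PER_HOUR = 30.0  # hypothetical multi-GPU serving cost

def ai_is_cheaper(server_cost_per_hour: float,
                  wage_per_hour: float = HUMAN_WAGE_PER_HOUR) -> bool:
    """True if running the model beats paying a human, per hour of output."""
    return server_cost_per_hour < wage_per_hour

print(ai_is_cheaper(SERVER_COST_PER_HOUR))  # False under these assumed numbers
```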

I would say that the main thing actually stopping the training of even larger LLMs is that the economic model hasn't been figured out yet.

0

charlesrwest t1_jbz1w0u wrote

Isn't that more or less what GPT-3 was? As I recall, most of the really big models cost millions to train.

4

Mayfieldmobster t1_jbzspe0 wrote

There are tools, like Colossal-AI, that let you train very large models on much smaller hardware.

1

Username912773 t1_jbz7y8x wrote

LLMs cannot be sentient: they require input to generate an output and have no initiative. They are essentially giant probabilistic networks that calculate the probability of the next token or word.

As you scale model size up, you need not only more compute to train it but also more time and more data. So why would anyone just say "screw it" and spend potentially millions or billions of dollars on something that may or may not work and will almost certainly have little monetary return?
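The "probability of the next token" framing above can be sketched in a few lines. The vocabulary and logits here are invented for illustration; a real LLM produces logits over a vocabulary of tens of thousands of tokens from billions of learned weights:

```python
import math
import random

# Toy next-token distribution: made-up scores over a four-word vocabulary.
vocab = ["the", "cat", "sat", "mat"]
logits = [2.0, 1.0, 0.5, 0.1]

# softmax turns raw scores into a probability distribution over tokens
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

# "generating" is then just sampling the next token from that distribution
next_token = random.choices(vocab, weights=probs, k=1)[0]
```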

−2

TemperatureAmazing67 t1_jbzc8cc wrote

'Require input to generate an output and do not have initiative': you could feed it random noise, or another network's output.

Also, the argument about the next token is flawed. For a lot of tasks, a perfectly predicted next token is everything you need.

2

Username912773 t1_jbze0ug wrote

That's not a solution. It doesn't make LLMs sentient; it just makes them a cog in a larger machine.

Logic, task performance, and sentience are different things.

1

MinaKovacs t1_jbyzv1v wrote

A binary computer is nothing more than an abacus. It doesn't matter how much you scale up an abacus, it will never achieve anything even remotely like "intelligence."

−8

RedditLovingSun t1_jbz78cm wrote

Depends on your definition of intelligence. The human brain is nothing but a bunch of neurons passing electrical signals to each other; I don't see why it's impossible for computers to simulate something similar and achieve the same results a brain does.

10

MurlocXYZ t1_jbzk75t wrote

> A binary computer is nothing more than an abacus

I could say the same thing about the human brain. It's just a complex abacus.

1

MinaKovacs t1_jbzso7m wrote

One of the few things we know for certain about the human brain is that it is nothing like a binary computer. Ask any neuroscientist and they will tell you we still have no idea how the brain works. The brain operates at a quantum level, manifested in mechanical, chemical, and electromagnetic characteristics all at once. It is not a ball of transistors.

0

hebekec256 OP t1_jbz0mpm wrote

Yes, I understand that, but LLMs and extensions of LLMs (like PaLM-E) are a heck of a lot more than an abacus. I wonder what would happen if Google just said "screw it" and scaled it from 500B to 50T parameters. I'm guessing there are reasons in the architecture that it would just break; otherwise I can't see why they wouldn't do it, since the risk-to-reward ratio seems favorable to me.

0

TemperatureAmazing67 t1_jbzcn6a wrote

> extensions of LLMs (like PaLM-E) are a heck of a lot more than an abacus. I wonder what would happen if Google just said, "screw it", and scaled it from 500B to 50T parameters.

The problem is that we have scaling laws for neural networks, and we simply do not have the data for 50T parameters. We would need to get that data somehow, and finding out the answer to that question costs a lot.
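The data constraint can be roughed out with the oft-cited Chinchilla-style heuristic of ~20 training tokens per parameter; the constant is an approximation from the scaling-law literature, not an exact law:

```python
# Rough estimate of training data needed for a compute-optimal model,
# assuming the Chinchilla-style heuristic of ~20 tokens per parameter.
TOKENS_PER_PARAM = 20

def tokens_needed(params: float) -> float:
    """Approximate training tokens for a model with `params` parameters."""
    return TOKENS_PER_PARAM * params

print(f"{tokens_needed(50e12):.1e}")  # 50T params -> 1.0e+15 tokens
```

For comparison, the largest public text corpora are on the order of trillions of tokens, a few orders of magnitude short of that figure.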

3

Co0k1eGal3xy t1_jbzi8wc wrote

  1. Double descent: more parameters are MORE data-efficient.
  2. Most of these LLMs barely complete 1 epoch, so there is no concern about overfitting currently.

1

MinaKovacs t1_jbz2gqw wrote

I think the math clearly doesn't work out; otherwise, Google would have monetized it already. ChatGPT is not profitable or practical for search. Hardware costs, power consumption, and slow performance are already at their limits. It will take something revolutionary, beyond binary computing, to make ML anything more than expensive algorithmic pattern recognition.

−1