
drekmonger t1_j9hvs1w wrote

Number of parameters is not the whole story. The quality of the training material, the training time, and the training techniques matter as much or more.

The larger models require more resources for inference, as well. I'd be more impressed by a model smaller than GPT-3 that performed just as well.

110

Hands0L0 t1_j9i277j wrote

I for one welcome competition in the race to AGI

43

xott t1_j9i2zg3 wrote

It's the new Space Race

9

phoenixmusicman t1_j9imtgj wrote

It's not a space race until governments start pouring massive amounts of their GDP into it.

1

Artanthos t1_j9l2n4a wrote

China is pouring money into AI research.

1

spreadlove5683 t1_j9i58z2 wrote

I sure don't. We need to get it right, not barrel ahead in an arms race.

9

amplex1337 t1_j9ir5dt wrote

You know the closer we get to AGI, the more that will happen. Every government will want to be the first in control of an ASI, which would basically make it the dominant superpower of the world. It will be as dystopian as it sounds.

2

beachmike t1_j9k5205 wrote

There will be both good and bad that comes out as we get closer to AGI, and attain AGI, just like any other technological revolution. To paint it as either dystopian or utopian is naive.

1

Artanthos t1_j9l3i0n wrote

It depends. We cannot see the other side of a singularity.

We could have an alignment issue and end up as paper clips.

AI could solve everything from climate change to social inequality by reducing the human race to 50 million Stone Age hunter-gatherers.

Or, you could have the top 1% living in a utopia while everyone else is living in a dystopia.

1

dangeratio t1_j9igzb4 wrote

Check out Amazon's multimodal chain-of-thought model: only 738 million parameters, and it scores better than ChatGPT on all question classes. See Table 4 on page 7 here - https://arxiv.org/pdf/2302.00923.pdf
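For the curious, here's a minimal text-only sketch of the paper's two-stage "generate a rationale, then answer" idea. google/flan-t5-base is just a stand-in; the actual Multimodal-CoT model also fuses image features and is fine-tuned on ScienceQA.

```python
# Minimal, text-only sketch of the two-stage idea:
# (1) generate a rationale, (2) answer conditioned on that rationale.
# google/flan-t5-base is only a placeholder for the paper's model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

def generate(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

question = "Which property do french fries and crackers have in common?"

# Stage 1: rationale generation.
rationale = generate(f"Question: {question}\nExplain your reasoning step by step.")

# Stage 2: answer inference, conditioned on the question plus the rationale.
answer = generate(f"Question: {question}\nRationale: {rationale}\nAnswer:")

print("Rationale:", rationale)
print("Answer:", answer)
```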

20

Destiny_Knight t1_j9iupzk wrote

What the actual fuck is that paper? The thing performed better than a human at several different question classes.

At fucking less than one billion parameters. 100x fewer than GPT-3.5.

Edit: For clarity, I am impressed, not angry lol.

12

IluvBsissa t1_j9j5t08 wrote

Are you angry or impressed?

3

Destiny_Knight t1_j9j6iq0 wrote

impressed lol

2

IluvBsissa t1_j9j6v5v wrote

If these models are so smol and efficient, why are they not released?? I just don't get it. I thought PaLM was kept private because it was too costly to run profitably...

3

kermunnist t1_j9kqsaw wrote

That's because the smaller models are less useful. With neural networks (likely including biological ones) there's a hard trade-off between specialized performance and general performance. If these 100+x smaller models were trained on the same data as GPT-3, they would perform 100+x worse on these metrics (maybe not exactly, because in this case the model was multimodal, which definitely gave a performance advantage). The big reason this model performed so well is that it was fine-tuned on problems similar to the ones on this exam, whereas GPT-3 was fine-tuned on anything and everything. This means that this model would likely not be a great conversationalist and would probably flounder at most other tasks GPT-3.5 does well on.
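Roughly what that kind of narrow fine-tuning looks like, as a sketch. The model, the tiny inline dataset, and the hyperparameters are all placeholders, not the paper's actual setup.

```python
# Illustrative sketch of "fine-tuned on problems similar to the exam":
# a small seq2seq model trained only on exam-style science questions.
# Model, data, and hyperparameters are placeholders, not the paper's setup.
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# In reality this would be thousands of ScienceQA-style items.
train_data = Dataset.from_list([
    {"question": "Which property do a potato chip and a cracker have in common?",
     "choices": "salty, slippery", "answer": "salty"},
])

def preprocess(example):
    prompt = f"Question: {example['question']}\nChoices: {example['choices']}\nAnswer:"
    inputs = tokenizer(prompt, truncation=True, max_length=512)
    inputs["labels"] = tokenizer(example["answer"], truncation=True)["input_ids"]
    return inputs

tokenized = train_data.map(preprocess, remove_columns=train_data.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="exam-specialist", num_train_epochs=3),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()  # strong on exam-style questions, likely weak everywhere else
```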

5

drekmonger t1_j9iios3 wrote

Heh. I tried their rationalization step with ChatGPT, just with prompting. For their question about the fries and crackers, it said the problem is flawed, because there's such a thing as crackers with low or no salt. It also correctly inferred that fries are usually salted, but don't have to be. (Of course, it didn't have the picture to go by, which was the point of the research.)
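Roughly like this, if anyone wants to reproduce it. A minimal sketch with the OpenAI Python client; the model name and prompt wording are just illustrative, not exactly what I typed.

```python
# Rough sketch of rationale-first prompting against a chat model.
# Model name and prompt wording are illustrative; the exam question normally
# comes with an image, which a text-only chat model never sees.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = (
    "Which property do these two objects have in common: "
    "french fries and crackers? Choices: salty, slippery."
)

# Ask for the rationale before the answer, mirroring the paper's two stages.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": f"{question}\nFirst write out your reasoning step by step, "
                   "then give the final answer on its own line.",
    }],
)

print(response.choices[0].message.content)
```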

Great paper though. Thanks for sharing.

8

challengethegods t1_j9i1lk3 wrote

>I'd be more impressed by a model smaller than GPT-3 that performed just as well.

From the article: "Aleph Alpha's model is on par with OpenAI's GPT-3 davinci model, despite having fewer parameters." So... you're saying you would be even more impressed if it used even fewer parameters? Anyway, I think anyone could guess GPT-3 is poorly optimized, so it shouldn't be surprising that plenty of models have matched its performance on some benchmarks with fewer parameters.

13

ninadpathak t1_j9hxyog wrote

True, we've seen models a tad bit bigger than GPT-3 which are so bad that even GPT-2 would blow them out of the water.

Think AI21's Jurassic-1, or whatever they call their largest model. I hate how stupid it is.

6

Professional-Song216 t1_j9hwijh wrote

Great way to look at it; it's much more important to squeeze the maximum out of your system. Efficiency over excess.

2

burnt_umber_ciera t1_j9iqusp wrote

Are you aware of the "training material," "training time," or "training techniques" utilized?

1

Zer0D0wn83 t1_j9iwxqu wrote

I'm sure they've read those papers too, you know.

1

ironborn123 t1_j9j2512 wrote

All else being equal, the number of model parameters does matter. Well-funded startups can acquire the needed data, compute resources, and human talent to build the models, just as OpenAI did when it beat Google at this game.

1