drekmonger t1_j9hvs1w wrote
Number of parameters is not the whole story. The quality of the training material, the training time, and the training techniques matter just as much, or more.
The larger models require more resources for inference, as well. I'd be more impressed by a model smaller than GPT-3 that performed just as well.
Hands0L0 t1_j9i277j wrote
I for one welcome competition in the race to AGI
xott t1_j9i2zg3 wrote
It's the new Space Race
fumblesmcdrum t1_j9ib2lt wrote
latent space race
Hands0L0 t1_j9jqo4a wrote
Fuck dude, that's clever
phoenixmusicman t1_j9imtgj wrote
It's not a space race until governments start pouring massive amounts of their GDP into it.
Artanthos t1_j9l2n4a wrote
China is pouring money into AI research.
spreadlove5683 t1_j9i58z2 wrote
I sure don't. We need to get it right. Not barrel ahead in an arms race.
amplex1337 t1_j9ir5dt wrote
You know the closer we get to AGI, the more that will happen. Every government will want to be the first in control of an ASI, which would basically make them the dominant superpower of the world. It will be as dystopian as it sounds.
[deleted] t1_j9j6qqt wrote
[deleted]
beachmike t1_j9k5205 wrote
There will be both good and bad that comes out as we get closer to AGI, and attain AGI, just like any other technological revolution. To paint it as either dystopian or utopian is naive.
Artanthos t1_j9l3i0n wrote
It depends. We cannot see the other side of a singularity.
We could have an alignment issue and end up as paper clips.
AI could solve everything from climate change to social inequality by reducing the human race to 50 million Stone Age hunter gatherers.
Or, you could have the top 1% living in a utopia while everyone else is living in a dystopia.
Ziggy5010 t1_j9j1id1 wrote
Agreed
dangeratio t1_j9igzb4 wrote
Check out Amazon’s multimodal chain-of-thought model: only 738 million parameters, and it scores better than ChatGPT on all question classes. See Table 4 on page 7 here - https://arxiv.org/pdf/2302.00923.pdf
Destiny_Knight t1_j9iupzk wrote
What the actual fuck is that paper? The thing performed better than a human at several different question classes.
At fucking less than one billion parameters. 100x less than GPT 3.5.
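For scale, the quoted counts can be checked directly. A quick sketch, assuming the often-cited (but unofficial) 175B-parameter figure for GPT-3.5 and the 738M figure for the larger Multimodal-CoT variant from the paper:

```python
# Rough parameter-count comparison. 175B for GPT-3.5 is an
# often-quoted but unconfirmed figure; 738M comes from the paper.
gpt35_params = 175e9
mm_cot_params = 738e6

ratio = gpt35_params / mm_cot_params
print(f"~{ratio:.0f}x smaller")  # ~237x smaller
```

So "100x less" actually understates the gap by more than a factor of two.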
Edit: For clarity, I am impressed not angry lol.
IluvBsissa t1_j9j5t08 wrote
Are you angry or impressed?
Destiny_Knight t1_j9j6iq0 wrote
impressed lol
IluvBsissa t1_j9j6v5v wrote
If these models are so smol and efficient, why are they not released ?? I just don't get it. I thought PaLM was kept private because it was too costly to run to be profitable...
kermunnist t1_j9kqsaw wrote
That's because the smaller models are less useful. With neural networks (likely including biological ones) there's a hard trade-off between specialized performance and general performance. If these 100+x smaller models were trained on the same data as GPT-3, they would perform far worse on these metrics (maybe not exactly 100+x worse, since in this case the model was multimodal, which definitely gave a performance advantage). The big reason this model performed so well is that it was fine-tuned on problems similar to the ones on this exam, whereas GPT-3 was fine-tuned on anything and everything. That means this model would likely not be a great conversationalist and would probably flounder at most other tasks GPT-3.5 does well on.
drekmonger t1_j9iios3 wrote
Heh. I tried their rationalization step with ChatGPT, just with prompting. For their question about the fries and crackers, it said the problem is flawed because there's such a thing as crackers with low or no salt. It also correctly inferred that fries are usually salted, but don't have to be. (Of course, it didn't have the picture to go by, which was the point of the research.)
Great paper though. Thanks for sharing.
challengethegods t1_j9i1lk3 wrote
>I'd be more impressed by a model smaller than GPT-3 that performed just as well.
from the article: "Aleph Alpha’s model is on par with OpenAI’s GPT-3 davinci model, despite having fewer parameters." So... you're saying you'd be even more impressed if it used even fewer parameters? Anyway, I think anyone could guess GPT-3 is poorly optimized, so it shouldn't be surprising that plenty of models have matched its performance on some benchmarks with fewer parameters.
ninadpathak t1_j9hxyog wrote
True, we've seen models a tad bit bigger than GPT-3 that are so bad even GPT-2 would blow them out of the water.
Think AI21 Jurassic park or whatever they call their largest model. I hate how stupid it is
musing2020 t1_j9it8e1 wrote
Achieving GPT 175B Level Accuracy with a 10x More Efficient Model
https://sambanova.ai/blog/achieving-gpt-175b-level-accuracy-with-a-10x-more-efficient-model/
Professional-Song216 t1_j9hwijh wrote
Great way to look at it, it’s much more important to squeeze the maximum out of your system. Efficiency over excess
burnt_umber_ciera t1_j9iqusp wrote
Are you aware of either the "training material" or "training time" or "training techniques" utilized?
Zer0D0wn83 t1_j9iwxqu wrote
I'm sure they've read those papers too, you know.
ironborn123 t1_j9j2512 wrote
All else being equal, number of model parameters does matter. Well funded startups can acquire the needed data, compute resources, and human talent to build the models. Just like how OpenAI beat Google at this game.