royalemate357 t1_j9rzbbc wrote

>When they scale they hallucinate more, produce more wrong information

Any papers/literature on this? AFAIK they do better and better on fact/trivia benchmarks and whatnot as you scale them up. It's not like smaller (GPT-like) language models are factually more correct ...

wind_dude t1_j9s1cr4 wrote

I'll see if I can find the benchmarks. I believe there are a few papers from IBM and DeepMind that talk about it, plus a benchmark study related to FLAN.
