SquareRootsi t1_isjsnk6 wrote
Reply to comment by EducationalCicada in [R] UL2: Unifying Language Learning Paradigms - Google Research 2022 - 20B parameters outperforming 175B GTP-3 and tripling the performance of T5-XXl on one-shot summarization. Public checkpoints! by Singularian2501
I haven't vetted this yet, but it looks well done at first glance. It compares multiple models across multiple tasks, so you can home in on your specific needs.
https://gem-benchmark.com/results
I think Hugging Face has something similar, but I haven't found all the info on a single, easy-to-compare page there. You have to bounce around between various model cards, tasks, and metrics pages to piece together the same information.