SquareRootsi t1_isjsnk6 wrote

I haven't vetted this yet, but at first glance it looks pretty well done. It compares multiple models across multiple tasks, so you can home in on your specific needs.

https://gem-benchmark.com/results

I think Hugging Face has something similar, but I haven't found all the info on a single page that's easy to compare. You kind of have to bounce around between various model cards, tasks, and metrics pages to find the same info.
