
limpbizkit4prez t1_jai7l96 wrote

It matters because the authors keep increasing model capacity to do better on a single task, and that's it. The strategy was also determined by the authors, not by the LLM. It would be way cooler if they constrained the problem to roughly the same number of parameters and showed generalization across multiple tasks. Again, it's neat, just not innovative or sexy.

2