hassan789_

hassan789_ t1_jb7hzjx wrote

Lack of quality information. There's a max of 12 trillion high-quality tokens for LLMs to learn from. After that, the returns could diminish, since maybe only 10% new quality info is added per year. Right now, the largest models are trained on 1T tokens.
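
For a rough back-of-the-envelope feel of those numbers, here is a minimal sketch. The three constants are the figures quoted above; the function names and the assumption that the 10% compounds yearly are mine, purely for illustration.

```python
import math

# Figures quoted in the comment above (rough estimates, not measurements).
HIGH_QUALITY_TOKENS = 12e12      # claimed ceiling of high-quality tokens
CURRENT_TRAINING_TOKENS = 1e12   # claimed training-set size of the largest models
ANNUAL_GROWTH = 0.10             # "maybe 10%" new quality info per year

def headroom(stock=HIGH_QUALITY_TOKENS, used=CURRENT_TRAINING_TOKENS):
    """How many times larger the available stock is than current training sets."""
    return stock / used

def stock_after(years, stock=HIGH_QUALITY_TOKENS, growth=ANNUAL_GROWTH):
    """Stock of quality tokens after `years`, ASSUMING growth compounds yearly."""
    return stock * (1 + growth) ** years

def years_to_double(growth=ANNUAL_GROWTH):
    """Years for the stock to double under the same compounding assumption."""
    return math.log(2) / math.log(1 + growth)

if __name__ == "__main__":
    print(f"headroom today: {headroom():.0f}x")
    print(f"stock after 5 years: {stock_after(5) / 1e12:.1f}T tokens")
    print(f"years for the stock to double: {years_to_double():.1f}")
```

So on these numbers there's about 12x headroom today, but once that's used up the stock only doubles every 7 years or so.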
