Viewing a single comment thread. View all comments

harharveryfunny t1_jbjxolz wrote

The LLM name for things like GPT-3 seems to have stuck, which IMO is a bit unfortunate since it's rather misleading. They certainly wouldn't need the amount of data they do if the goal was merely a language model, nor would we need to have progressed past smaller models like GPT-1. The "predict next word" training/feedback may not have changed, but the capabilities people are hoping to induce in these larger/ginormous models is now way beyond language and into the realms of world model, semantics and thought.

2