
dat_cosmo_cat t1_iwteguv wrote

You and I are literally saying the same thing. These models have been in prod on every major software platform since BERT.

We don't even need to look at offline eval metrics anymore. If you're an actual MLE / data scientist, you likely have pipelines set up that directly measure the differences in engagement and attributable sales, and report the real business impact across millions of users each time a new model is released.

I work on a team that has made millions of dollars building applications on top of LLMs since 2018, so when I see the claim "LLMs finally got good this year," it's hard not to laugh. That is what I'm getting at.

Edit: Did you read the article?
