
andreichiffa t1_j5uczy3 wrote

*LeCun. And their Galactica was the subject of so much ridicule that, after a pompous launch, it was un-launched 48 hours later. OPT-175B is a clone of OpenAI’s GPT-3, but it performs worse and is essentially a massive pain in the ass for cyber-security and phishing/disinformation.

LeCun was always into ConvNets for machine vision - text-to-text is Hinton, Bengio, and Sutskever.

So far it looks like Baidu and Google have bigger transformer-based models that could perform better, but only Google’s PaLM is architecturally different enough to potentially perform better.

There are also augmented variants of Transformer-based models that are capable of more factual responses, but they tend to be less conversational.
