andreichiffa t1_j5uczy3 wrote
*LeCun. And their Galactica was the subject of so much ridicule that, after a pompous launch, it was un-launched 48 hours later. OPT-175B is a clone of OpenAI’s GPT-3, but it performs worse and is essentially a massive pain in the ass for cyber-security and phishing/disinformation.
LeCun was always into ConvNets for machine vision - text-to-text is Hinton, Bengio, and Sutskever.
So far it looks like Baidu and Google have bigger Transformer-based models that could perform better, but only Google’s PaLM is architecturally different enough to plausibly do so.
There are also augmented variants of Transformer-based models that are capable of more factual responses, but they tend to be less conversational.