HelpfulFriend0 t1_isinxpf wrote on October 16, 2022 at 7:40 AM Reply to [R] UL2: Unifying Language Learning Paradigms - Google Research 2022 - 20B parameters outperforming 175B GTP-3 and tripling the performance of T5-XXl on one-shot summarization. Public checkpoints! by Singularian2501 Any idea how it does relative to Alexa teacher model? https://www.amazon.science/publications/alexa-teacher-model-pretraining-and-distilling-multi-billion-parameter-encoders-for-natural-language-understanding-systems Permalink 3
HelpfulFriend0 t1_isinxpf wrote
Reply to [R] UL2: Unifying Language Learning Paradigms - Google Research 2022 - 20B parameters outperforming 175B GTP-3 and tripling the performance of T5-XXl on one-shot summarization. Public checkpoints! by Singularian2501
Any idea how it does relative to Alexa teacher model? https://www.amazon.science/publications/alexa-teacher-model-pretraining-and-distilling-multi-billion-parameter-encoders-for-natural-language-understanding-systems