[R] UL2: Unifying Language Learning Paradigms - Google Research 2022 - 20B parameters outperforming 175B GPT-3 and tripling the performance of T5-XXL on one-shot summarization. Public checkpoints! Submitted by Singularian2501 t3_y4tp4b on October 15, 2022 at 5:31 PM in MachineLearning 14 comments 190
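Since the post highlights the public checkpoints, here is a minimal sketch of what loading one might look like, assuming the 20B model is published on the Hugging Face Hub as "google/ul2" and loads as a standard T5-style model; the model id, the "[S2S]" mode prefix, and the prompt text are assumptions for illustration, not confirmed by the post.

```python
# Hypothetical sketch: loading a released UL2 checkpoint with Hugging Face
# transformers. Assumes the checkpoint id "google/ul2" and enough memory
# (a 20B model needs roughly 40 GB of weights in bfloat16).
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/ul2")
model = T5ForConditionalGeneration.from_pretrained(
    "google/ul2", torch_dtype=torch.bfloat16, device_map="auto"
)

# One-shot-style summarization prompt. The "[S2S]" mode token (an assumption
# here) is how the paper describes switching the model into PrefixLM mode.
prompt = "[S2S] Summarize: The UL2 paper proposes a mixture-of-denoisers objective ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```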
massimosclaw2 t1_ishdjbw wrote on October 16, 2022 at 12:18 AM I wonder how this will perform on out-of-distribution stuff, and on remembering obscure references like "Alfred Korzybski" and what they relate to (as GPT-3 does), or whether 20B parameters is too small to memorize enough Permalink 6