massimosclaw2 t1_ishdjbw wrote on October 16, 2022 at 12:18 AM

I wonder how this will perform on out-of-distribution stuff, and whether it can remember obscure references like "Alfred Korzybski" (as GPT-3 does) and what they're related to, or if 20B parameters is too small to memorize enough.