lookatmetype t1_j62j0t3 wrote
Reply to comment by currentscurrents in [R] Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers by currentscurrents
Is there anything he hasn't done?
lookatmetype t1_j47o3hu wrote
Reply to comment by nohat in [D] Bitter lesson 2.0? by Tea_Pearce
Yeah, I'm lost because I literally don't understand the distinction.
lookatmetype t1_j64nstm wrote
Reply to comment by CKtalon in [R] SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot by Secure-Technology-78
To be fair, most of the weights in every "Foundation" model are useless.