[R] Getting GPT-3 quality with a model 1000x smaller via distillation plus Snorkel
Submitted by bradenjh on November 22, 2022 at 9:59 PM in MachineLearning
ayse_ww wrote on November 23, 2022 at 5:48 AM, in reply to a comment by bradenjh:
This is quite interesting. Is such a self-training scheme similar to a recurrent network?