[R] Tips on training Transformers
Submitted by parabellum630 on November 20, 2022 at 4:23 PM in MachineLearning (23 comments)
drivanova wrote on November 21, 2022 at 9:17 PM, in reply to fasttosmile:

That, plus a decent LR scheduler, e.g. linear ramp-up followed by exponential or cosine annealing.
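For anyone who wants to see the schedule from the comment above spelled out, here is a minimal sketch of linear ramp-up followed by cosine annealing in plain Python. The function name `warmup_cosine_lr` and all default values (`base_lr`, `warmup_steps`, `total_steps`, `min_lr`) are illustrative assumptions, not settings recommended anywhere in the thread.

```python
import math


def warmup_cosine_lr(step, base_lr=1e-3, warmup_steps=1000,
                     total_steps=10000, min_lr=0.0):
    """Learning rate at `step`: linear ramp-up, then cosine annealing.

    All defaults are placeholder values chosen for illustration only.
    """
    if step < warmup_steps:
        # Linear ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Cosine decay from base_lr down to min_lr.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine


if __name__ == "__main__":
    # Print the learning rate at a few points along the schedule.
    for s in (0, 500, 999, 1000, 5500, 10000):
        print(s, warmup_cosine_lr(s))
```

In a PyTorch training loop, the same shape could probably be plugged in via `torch.optim.lr_scheduler.LambdaLR` by returning the ratio `warmup_cosine_lr(step) / base_lr` as the multiplier. Swapping the cosine term for an exponential decay would give the other variant mentioned in the comment.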