Why bigger transformer models are better learners? Submitted by begooboi t3_119zmpd on February 23, 2023 at 2:56 PM in deeplearning 15 comments 7
suflaj t1_j9swm0z wrote on February 24, 2023 at 9:11 AM Reply to comment by junetwentyfirst2020 in Why bigger transformer models are better learners? by begooboi
I'm not sure what you mean. I'm using the usual definition of noise.