jeankaddour t1_irdvad7 wrote
Reply to comment by jeankaddour in [R] Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging by rlresearcher
Thanks again for this feedback. I haven't trained on a different dataset yet, but I replaced all BERT perplexity numbers/plots with the MLM losses in the meantime. The paper has been updated today on Arxiv.
Viewing a single comment thread. View all comments