_vb__ (t1_jdiwjqk) wrote on March 24, 2023 at 6:38 PM, replying to Rishh3112 in "Cuda out of memory error" by Rishh3112:

Are you calling the zero_grad method on your optimizer in every step of your training loop?
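As a minimal sketch of why this matters (hypothetical model and data, just to illustrate): PyTorch accumulates gradients across `backward()` calls, so a loop that skips `optimizer.zero_grad()` keeps adding stale gradients, which corrupts the updates and can contribute to growing GPU memory use.

```python
import torch

# Tiny hypothetical model and batch, purely for illustration.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()
x, y = torch.randn(8, 4), torch.randn(8, 1)

# Without zero_grad: the second backward() ADDS to the first gradient.
loss_fn(model(x), y).backward()
g1 = model.weight.grad.clone()
loss_fn(model(x), y).backward()
g2 = model.weight.grad.clone()
print(torch.allclose(g2, 2 * g1))  # gradients accumulated, not replaced

# Correct loop: clear gradients at every step before backward().
for _ in range(3):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```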
_vb__ (t1_j6ocec9) wrote on January 31, 2023 at 7:21 PM, replying to neuralbeans in "Best practice for capping a softmax" by neuralbeans:

No, it would bring the logits closer to one another and make the overall model a bit less confident in its probabilities.
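A small sketch of the effect being described (the temperature value T=2 here is an arbitrary choice for illustration): scaling logits down before the softmax pulls them closer together, which flattens the output distribution and lowers the top probability.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 1.0, 0.0]

p = softmax(logits)                           # standard softmax
p_scaled = softmax([z / 2.0 for z in logits])  # logits divided by T=2

# The scaled distribution is flatter: its maximum probability is lower,
# i.e. the model is less confident.
print(max(p), max(p_scaled))
```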