
trajo123 t1_jdhi7u8 wrote

Reply to comment by Rishh3112 in Cuda out of memory error by Rishh3112

The problem is likely in your training loop. Perhaps your computation graph keeps growing because you track the average loss as an autograd variable rather than a plain number. Make sure that for any metrics/logging you use loss.item().
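A minimal sketch of what this means, assuming a generic PyTorch loop (the model, optimizer, and data here are placeholders, not OP's actual code):

```python
import torch

# Hypothetical tiny setup just to illustrate the pattern.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = torch.nn.MSELoss()

running_loss = 0.0
for _ in range(100):
    x = torch.randn(32, 10)
    y = torch.randn(32, 1)

    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

    # BAD:  running_loss += loss
    #       keeps a reference to the whole graph, so memory grows every step.
    # GOOD: .item() converts to a plain Python float, releasing the graph.
    running_loss += loss.item()

avg_loss = running_loss / 100
```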

5

humpeldumpel t1_jdhpl0w wrote

And also make use of the model's training and evaluation modes (model.train() / model.eval())
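For reference, a quick sketch of how those modes are switched, plus torch.no_grad() for validation, which is what actually avoids building a graph (the model here is an arbitrary example):

```python
import torch

# Dropout behaves differently in train vs eval mode.
model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Dropout(0.5))

model.train()   # enable dropout / batch-norm running-stat updates
assert model.training

model.eval()    # disable them for validation or inference
assert not model.training

# During validation, also skip autograd graph construction to save memory:
with torch.no_grad():
    out = model(torch.randn(2, 4))
```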

2

Rishh3112 OP t1_jdhib79 wrote

Sure, I'll give it a try. Thanks a lot.

1

Rishh3112 OP t1_jdhiguj wrote

I just checked; in my training loop I'm using loss.item().

1

_vb__ t1_jdiwjqk wrote

Are you calling the zero_grad method on your optimizer in every step of your training loop?
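The point being raised: PyTorch accumulates gradients across backward() calls, so skipping zero_grad() makes .grad buffers (and the memory behind them) pile up. A minimal sketch, with placeholder model and data:

```python
import torch

model = torch.nn.Linear(3, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

x, y = torch.randn(8, 3), torch.randn(8, 1)

for _ in range(2):
    opt.zero_grad()          # clear gradients left over from the previous step
    loss = loss_fn(model(x), y)
    loss.backward()          # without zero_grad, these would accumulate into .grad
    opt.step()
```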

3