
BlazeObsidian t1_iujbbdu wrote

Are you sure your model is running on the GPU? See https://towardsdatascience.com/pytorch-switching-to-the-gpu-a7c0b21e8a99, or if you can see GPU utilisation, that might be a simpler way to verify.

If you are not explicitly moving your model to the GPU, I think it's running on the CPU. Also, how long is it taking? Do you have a specific time that you compared the performance against?
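For reference, a minimal sketch of the explicit-move-and-check idea in PyTorch (the model and batch here are stand-ins, not the OP's actual code):

```python
import torch
import torch.nn as nn

# Stand-in model and batch, purely to illustrate device placement.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(128, 64).to(device)        # parameters must live on the GPU
batch = torch.randn(32, 128, device=device)  # inputs must be on the same device

out = model(batch)

# Sanity checks: both should print "cuda:0" if the GPU is actually being used.
print(next(model.parameters()).device)
print(out.device)
```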


alexnasla OP t1_iujbukx wrote

I'm pretty sure it's running on the GPU. I don't remember what the GPU utilization was though; I'll take a look when I get a chance.

The test that I mentioned ran for 8 hours.


K-o-s-l-s t1_iujldkh wrote

What are you using to log and monitor your jobs? Knowing CPU, RAM, and GPU utilisation will make this a lot easier to understand.

I agree with the poster above; no appreciable speed-up when switching from a K80 to an A100 makes me suspect that the GPU is not being utilised at all.
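If it helps, a quick sketch of how one might grab those numbers during a run (assumes the NVIDIA driver tools are installed; the exact setup will differ on the OP's machine):

```python
import shutil
import subprocess
import torch

# Snapshot of GPU utilisation and memory via nvidia-smi, if it's on PATH.
if shutil.which("nvidia-smi"):
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used,memory.total",
         "--format=csv"],
        capture_output=True, text=True)
    print(result.stdout)

# PyTorch's own view of memory on the current CUDA device.
if torch.cuda.is_available():
    print(f"allocated: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
    print(f"reserved:  {torch.cuda.memory_reserved() / 1e9:.2f} GB")
```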


alexnasla OP t1_iujn3mm wrote

OK, so what I did was actually max out the input buffers to the most the GPU can handle without crashing, so basically fully saturating the VRAM.
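A minimal sketch (names illustrative, not the OP's code) for checking how much of that VRAM PyTorch reports as actually used after such a run:

```python
import torch

# Peak memory held by tensors vs. memory reserved by PyTorch's caching
# allocator vs. the card's total VRAM. Note that a full reserved pool does
# not by itself mean the compute units are busy.
if torch.cuda.is_available():
    peak_alloc = torch.cuda.max_memory_allocated() / 1e9
    peak_reserved = torch.cuda.max_memory_reserved() / 1e9
    total = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"peak allocated: {peak_alloc:.2f} GB")
    print(f"peak reserved:  {peak_reserved:.2f} GB")
    print(f"total VRAM:     {total:.2f} GB")
```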
