
suflaj t1_ir83jmj wrote

I'm talking about detach. From what I could find on the internet, the "copy" part is just taking the tensor's data and wrapping it in a new tensor (the old docs call it a Variable) that shares the same underlying storage; it does not mean an actual copy in memory happens. And from what I understand, to get a hard copy you have to clone the detached tensor.
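
A minimal sketch of the difference (assuming a recent PyTorch; data_ptr() returns the address of the tensor's underlying storage):

```python
import torch

x = torch.ones(3, requires_grad=True)

# detach() cuts the tensor off from the autograd graph but
# shares the same underlying storage -- no data is copied.
d = x.detach()
print(d.data_ptr() == x.data_ptr())  # True: same memory

# clone() on the detached tensor makes an independent copy.
c = x.detach().clone()
print(c.data_ptr() == x.data_ptr())  # False: new allocation
```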

If all OP does is detach tensors, then it's O(1), since no data is copied. But we can't know that without further information, so I elaborated that it's likely closer to O(n): I presume they might be doing something beyond a plain detach, such as cloning.

1

mishtimoi OP t1_ir9wwlx wrote

Yeah, this makes sense. If it's only detach for all layers, it's cheap, like calling .eval(): no copy of the whole model footprint is made (as per your explanation). But in my case it has to keep a copy at every point where I detach and clone, I guess.
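
Something like this hypothetical sketch of what I'm doing (the model and shapes are made up; the point is that each detach-and-clone snapshot adds its own allocation):

```python
import torch
import torch.nn as nn

# Stand-in for the real network.
model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 4))
x = torch.randn(2, 8)

snapshots = []  # one stored activation per detach point
for layer in model:
    x = layer(x)
    # detach() alone would share storage with x (O(1));
    # clone() allocates a fresh copy, so memory grows per snapshot.
    snapshots.append(x.detach().clone())
```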

1