
Small_Stand_8716 t1_ir6a9ja wrote

I don't know how detach performs, but why not set requires_grad to False to freeze some layers? It can substantially speed up training and reduce memory usage.
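A minimal sketch of what I mean, assuming PyTorch. The toy model, layer sizes, and learning rate are made up for illustration, since the actual architecture isn't shown in this thread:

```python
import torch
from torch import nn

# Hypothetical toy model; stands in for whatever network is being trained.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 4),
)

# Freeze the first linear layer: autograd will not compute or store
# gradients for its weight and bias.
for param in model[0].parameters():
    param.requires_grad_(False)

frozen_before = model[0].weight.clone()

# Hand only the trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)

x = torch.randn(8, 16)  # dummy batch
loss = model(x).sum()
loss.backward()
optimizer.step()
```

The savings tend to be largest when the frozen layers are at the input end of the network, because the backward pass can stop before reaching them. If the frozen layers sit after trainable ones, gradients still have to flow back through them, so the gain may be small.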


mishtimoi OP t1_ir6pvy8 wrote

I also tried setting requires_grad=False but did not see much improvement.
