I_draw_boxes t1_jcia41b wrote
Reply to comment by ggf31416 in [D] Choosing Cloud vs local hardware for training LLMs. What's best for a small research group? by PK_thundr
An Nvidia driver fix is forthcoming for the P2P-related issue with PyTorch DDP training. The 3090 didn't support P2P either, and the fix won't enable P2P on the 4090, but it will correct the issue, and DDP training should be much faster once it lands.
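
If you want to see what your current driver reports for P2P between cards, here is a rough sketch. It assumes PyTorch's `torch.cuda.can_device_access_peer` is the check you care about; the helper names `p2p_matrix` and `print_report` are made up for illustration. It only shows what the driver claims, not whether DDP will actually run well on top of it.

```python
from typing import Callable, List, Optional


def p2p_matrix(
    num_devices: int,
    can_access: Optional[Callable[[int, int], bool]] = None,
) -> List[List[bool]]:
    """Return matrix[i][j] == True if device i reports peer access to device j."""
    if can_access is None:
        # Imported lazily so the helper can be exercised without PyTorch installed.
        import torch

        can_access = torch.cuda.can_device_access_peer
    # A device is never queried against itself; the diagonal is always False.
    return [
        [i != j and bool(can_access(i, j)) for j in range(num_devices)]
        for i in range(num_devices)
    ]


def print_report() -> None:
    """Print the reported P2P matrix for all visible CUDA devices."""
    import torch

    n = torch.cuda.device_count()
    for i, row in enumerate(p2p_matrix(n)):
        print(f"GPU {i}:", ["P2P" if ok else "-" for ok in row])
```

Calling `print_report()` on a multi-GPU box prints one row per card. On 3090s or 4090s you would expect every entry to be `-`, since neither supports P2P.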