Why does my Transformer blow GPU memory? Submitted by beautyofdeduction t3_10uuslf on February 6, 2023 at 2:12 AM in deeplearning 12 comments 6
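Long_Two_6176 t1_j7gc9rz wrote on February 6, 2023 at 4:32 PM Remember that computations, not just the parameter count, cost GPU memory: every intermediate tensor produced in the forward pass takes up space, and during training those tensors are generally kept around for the backward pass. Check your intermediate tensor sizes.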
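To make the point about intermediate tensors concrete, here is a rough back-of-the-envelope sketch. It is not taken from the original post: the function names, batch size, head count, sequence length, layer count and parameter count are all made-up assumptions, and it only counts the attention score tensor of shape (batch, heads, seq_len, seq_len), ignoring every other activation, gradients and optimizer state.

```python
def attention_score_bytes(batch, heads, seq_len, bytes_per_elem=4):
    """Bytes for one layer's attention score tensor,
    shape (batch, heads, seq_len, seq_len). float32 by default."""
    return batch * heads * seq_len * seq_len * bytes_per_elem


def param_bytes(n_params, bytes_per_elem=4):
    """Bytes needed just to store the weights. float32 by default."""
    return n_params * bytes_per_elem


# Hypothetical model and batch settings (assumptions, not from the thread).
batch, heads, seq_len, layers = 8, 8, 2048, 6
n_params = 50_000_000

GIB = 2 ** 30
per_layer = attention_score_bytes(batch, heads, seq_len)

print(f"weights:                  {param_bytes(n_params) / GIB:.2f} GiB")
print(f"attention scores / layer: {per_layer / GIB:.2f} GiB")
print(f"attention scores, total:  {layers * per_layer / GIB:.2f} GiB")
```

With these made-up numbers the score tensor of a single layer is 1 GiB in float32, while the weights are only about 0.19 GiB. The score tensor grows with the square of the sequence length, so doubling seq_len quadruples that term even though the parameter count does not change.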