Submitted by Available_Lion_652 t3_10xu09v in MachineLearning
JustOneAvailableName t1_j7v99gd wrote
If the model is sufficiently large (and if it isn't, you won't be waiting long anyway) and no expensive CPU pre-/post-processing is done, the 3090 will be the bottleneck.
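One way to sanity-check that claim yourself (a minimal sketch, not from the original comment: the toy model, synthetic dataset, and sizes are stand-ins, and it assumes PyTorch on a CUDA GPU) is to time training steps fed by a real data pipeline against steps that reuse a single cached batch. If the two timings are close, the GPU is the bottleneck:

```python
# Sketch: compare step time with a live DataLoader vs. a cached batch.
# Similar timings => the GPU, not CPU-side data work, is the bottleneck.
import time
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = "cuda"  # assumes a CUDA-capable GPU such as a 3090
model = nn.Sequential(*[nn.Linear(4096, 4096) for _ in range(8)]).to(device)
opt = torch.optim.AdamW(model.parameters())
data = TensorDataset(torch.randn(2048, 4096), torch.randn(2048, 4096))
loader = DataLoader(data, batch_size=32, num_workers=2)

def step(x, y):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x.to(device)), y.to(device))
    loss.backward()
    opt.step()

def timed(batches):
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for x, y in batches:
        step(x, y)
    torch.cuda.synchronize()
    return time.perf_counter() - t0

with_pipeline = timed(loader)
cached = next(iter(loader))
no_pipeline = timed([cached] * len(loader))
print(f"with DataLoader: {with_pipeline:.2f}s, cached batch: {no_pipeline:.2f}s")
```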
A single 3090 might not have enough memory to train GPT-2 Large, but it's probably close.
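Back-of-the-envelope arithmetic backs this up (a sketch: GPT-2 Large's ~774M parameter count is public, and the 16 bytes/param figure assumes plain fp32 Adam with no memory-saving tricks):

```python
# Rough training memory for GPT-2 Large with vanilla fp32 Adam.
# 16 bytes/param = 4 (weights) + 4 (gradients) + 8 (Adam's two moment estimates).
# Activations come on top of this, so it's a lower bound.
params = 774e6               # GPT-2 Large parameter count
bytes_per_param = 4 + 4 + 8  # weights + grads + Adam m and v, all fp32
gib = params * bytes_per_param / 2**30
print(f"~{gib:.1f} GiB before activations vs. 24 GiB on a 3090")
# -> ~11.5 GiB; activations and batch size decide whether it actually fits.
```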
Fully training an LLM from scratch on a single 3090 is impossible, but you could fine-tune one.
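For example, parameter-efficient methods like LoRA keep the base weights frozen and only train small adapter matrices, which is what makes single-GPU fine-tuning feasible. A minimal setup sketch, assuming the Hugging Face `transformers` and `peft` libraries (LoRA is one common approach, not necessarily what the commenter had in mind):

```python
# LoRA fine-tuning setup sketch: only the low-rank adapters are trainable,
# so optimizer state and gradients shrink to a fraction of full fine-tuning.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2-large")
config = LoraConfig(
    r=8,                        # adapter rank
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2's fused QKV attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # a fraction of a percent of all params
```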