
JustOneAvailableName t1_j7v99gd wrote

If the model is sufficiently large (if not, you don't really need to wait long anyway) and no expensive CPU pre-/post-processing is done, the 3090 will be the bottleneck. See the sketch below for keeping that CPU work off the critical path.
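A minimal PyTorch sketch of what "no expensive CPU preprocessing" looks like in practice: push the CPU-side work into parallel DataLoader workers so the GPU stays busy. The dataset here is just a stand-in.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    """Stand-in for a dataset whose __getitem__ does CPU-side preprocessing."""
    def __len__(self):
        return 10_000

    def __getitem__(self, idx):
        x = torch.randn(512)  # pretend this is expensive preprocessing
        return x, torch.tensor(idx % 2)

if __name__ == "__main__":
    loader = DataLoader(
        ToyDataset(),
        batch_size=32,
        num_workers=4,    # preprocess batches in parallel worker processes
        pin_memory=True,  # page-locked buffers speed up host-to-GPU copies
    )
```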

A single 3090 might not have enough memory to fully train GPT-2 Large, but it's probably close.
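Rough math behind "probably close", assuming fp32 Adam and ignoring activations (which scale with batch size and sequence length):

```python
# Back-of-the-envelope memory estimate for fully training GPT-2 Large (~774M params).
params = 774e6
weights = params * 4      # fp32 weights
grads = params * 4       # fp32 gradients
adam_states = params * 8  # Adam's two moment buffers, fp32 each

total_gb = (weights + grads + adam_states) / 1024**3
print(f"~{total_gb:.1f} GB before activations")  # ~11.5 GB of a 3090's 24 GB
```

That leaves roughly half the card for activations, so it fits or doesn't depending on batch size, sequence length, and whether you use gradient checkpointing.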

Fully training an LLM on a single 3090 is impossible, but you could finetune one.
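A minimal finetuning sketch, assuming the Hugging Face transformers and peft libraries; gpt2-large and the LoRA hyperparameters are just placeholders for whatever model and settings you'd actually use:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "gpt2-large"  # placeholder; any causal LM that fits in 24 GB
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# LoRA: train small low-rank adapters instead of the full weight matrices.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of weights train
```

Since the base weights stay frozen, gradients and optimizer state only exist for the small adapter matrices, which is what makes 24 GB enough.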
