Submitted by faker10101891 t3_10cxuo2 in MachineLearning
I'm aware transformers are pretty VRAM-hungry and a 4080 only has 16 GB. So I'm guessing a lot of transformer-based models will be out of the question. At least anything that's interesting.
Not sure about other models, though. Is there anything I can do with a 4080 that's beyond just some toy experiment?
junetwentyfirst2020 t1_j4jgu4t wrote
I’m not sure why you think that’s such a crummy graphics card. I’ve trained a lot of interesting things for grad school, and even in the workplace, on 4 GB less. If you’re fine-tuning, then it’s not really going to take that long to get decent results, and 16 GB is not bad.