ButthurtFeminists t1_ir0zlg1 wrote

I'm surprised this one hasn't been mentioned already.

Long training times can come from model complexity and/or dataset size. So if it's difficult to downscale your model, you can use a subset of your dataset instead. For example, say I'm training a ResNet-152 on ImageNet: to reduce hyperparameter-tuning time, I could sample a subset of ImageNet (maybe 1/10 the size), tune hyperparameters on that, and then train with the best configuration on the full dataset.
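A minimal sketch of the subsetting step, in plain Python (the function name and the `fraction`/`seed` parameters are my own for illustration; with PyTorch you'd typically pass indices like these to `torch.utils.data.Subset`):

```python
import random

def sample_subset_indices(n_total, fraction, seed=0):
    """Return a reproducible random sample of dataset indices.

    n_total:  number of examples in the full dataset
    fraction: fraction of the dataset to keep (e.g. 0.1 for 1/10)
    seed:     fixed seed so every tuning run sees the same subset
    """
    rng = random.Random(seed)
    n_subset = max(1, int(n_total * fraction))
    return rng.sample(range(n_total), n_subset)

# e.g. roughly 1/10 of an ImageNet-sized training set
subset_idx = sample_subset_indices(1_281_167, 0.1)
```

One caveat: a uniform random sample can skew class balance on long-tailed data, so a stratified (per-class) sample may give more reliable tuning results.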