ButthurtFeminists t1_ir1h1ir wrote
Reply to comment by bphase in [D] How do you go about hyperparameter tuning when network takes a long time to train? by twocupv60
This could work as well, but there's a caveat: converging training on a larger dataset is inherently harder, so a truncated run can behave differently from a full one. If your goal is to see how the model performs once it has converged on the dataset, then running with fewer epochs may not be the best choice.
ButthurtFeminists t1_ir0zlg1 wrote
Reply to [D] How do you go about hyperparameter tuning when network takes a long time to train? by twocupv60
I'm surprised this one hasn't been mentioned already.
Long training time could be due to model complexity and/or dataset size, so if it's difficult to downscale your model, you can train on a subset of your dataset instead. For example, say I'm training a ResNet-152 on ImageNet: to cut the cost of hyperparameter search, I could sample a subset of ImageNet (maybe 1/10 the size), tune hyperparameters on that, and then test the best configuration on the full dataset.
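A minimal sketch of that workflow, assuming PyTorch (the dataset/model setup, the learning-rate grid, and the `train`/`model_fn` helpers are placeholders, not part of the original comment):

```python
import torch
from torch.utils.data import Subset, DataLoader

def make_subset(dataset, fraction=0.1, seed=0):
    """Randomly sample a fixed fraction of a dataset for cheap HP tuning."""
    g = torch.Generator().manual_seed(seed)  # fixed seed: same subset for every trial
    n = int(len(dataset) * fraction)
    idx = torch.randperm(len(dataset), generator=g)[:n]
    return Subset(dataset, idx.tolist())

# Hypothetical usage (train() and model_fn() are stand-ins, not real APIs):
# full_train = torchvision.datasets.ImageNet(root="...", split="train", transform=...)
# tune_set = make_subset(full_train, fraction=0.1)
# loader = DataLoader(tune_set, batch_size=256, shuffle=True, num_workers=8)
# for lr in (1e-1, 1e-2, 1e-3):
#     train(model_fn(), loader, lr=lr)  # pick the best lr by validation accuracy
# ...then retrain the winning config on full_train.
```

One design note: for classification, a stratified (per-class) sample keeps the label distribution closer to the full dataset than a uniform random draw does.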
ButthurtFeminists t1_ir6x9pv wrote
Reply to Time Complexity of Detach() in torch "[R]" by mishtimoi
How much of a slowdown?? I'm interested in this as well
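For anyone wanting to put a number on it, here's a hypothetical micro-benchmark sketch (mine, not from the thread). Since detach() returns a view that shares the same storage rather than copying data, it should be cheap, but timing it settles the question:

```python
import time
import torch

x = torch.randn(4096, 4096, requires_grad=True)
y = x * 2  # an op so y actually sits in an autograd graph

# On GPU, wrap the timed region with torch.cuda.synchronize() calls.
n_iters = 10_000
t0 = time.perf_counter()
for _ in range(n_iters):
    z = y.detach()
t1 = time.perf_counter()
print(f"detach(): {(t1 - t0) / n_iters * 1e6:.2f} us per call")
```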