ppg_dork t1_iyue984 wrote

I think the answer is going to depend heavily on your domain. For the types of datasets I typically use, I will keep the training going for a while, drop the learning rate once or twice if the validation loss keeps going down, and eventually stop when overfitting begins or loss plateaus.
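
For what it's worth, here is a rough sketch of that routine in plain Python. It is only one reading of the recipe above, not a definitive implementation: the function name `run_schedule`, the `patience`, `drop_factor`, `max_drops`, and `min_delta` parameters, and all of their default values are assumptions made up for illustration. It treats "plateau" as a fixed number of epochs without a meaningful improvement in validation loss, drops the learning rate at most twice, and stops at the next plateau after that.

```python
def run_schedule(val_losses, lr=0.1, drop_factor=0.1, max_drops=2,
                 patience=3, min_delta=1e-4):
    """Walk per-epoch validation losses and decide when to drop LR or stop.

    All parameter names and defaults are hypothetical, chosen for illustration.
    Returns (stop_epoch, best_epoch, final_lr).
    """
    best = float("inf")   # lowest validation loss seen so far
    best_epoch = -1       # epoch at which that loss occurred
    stale = 0             # epochs since the last meaningful improvement
    drops = 0             # number of LR drops used so far

    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Validation loss improved: record it and reset the counter.
            best, best_epoch, stale = loss, epoch, 0
        else:
            # No improvement (plateau, or loss rising from overfitting).
            stale += 1

        if stale >= patience:
            if drops < max_drops:
                # Plateau with drops remaining: lower the LR and keep going.
                lr *= drop_factor
                drops += 1
                stale = 0
            else:
                # Drops exhausted and still no improvement: stop here.
                return epoch, best_epoch, lr

    # Ran out of epochs without triggering the stop condition.
    return len(val_losses) - 1, best_epoch, lr
```

In a real training loop you would feed in the validation loss after each epoch and restore the checkpoint saved at `best_epoch` once it stops. The right `patience` and number of drops will depend on the domain, as noted above.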

However, I typically work on (relatively) simple CV problems. I've heard from colleagues that they sometimes keep training well after the validation loss starts to increase, because the loss will eventually drop back down. This seems more common with RL or generative models, though. I'll reiterate that I'm regurgitating what others have mentioned.

If your goal ISN'T to shoot for SOTA, I would look at the "best" examples of applied literature in a relevant field. I've often found that those papers provide recipes that transfer well to similar problems, whereas the SOTA-setting papers tend to be quite specialized to ImageNet/CIFAR-whatever/etc.