Submitted by thanderrine t3_zc0kco in MachineLearning
CrazyCrab t1_izg3hw6 wrote
Reply to comment by _Arsenie_Boca_ in [D] Determining the right time to quit training (CNN) by thanderrine
Recently, I overfit to the validation dataset by doing this. The task is semantic segmentation. I trained for a very long time and took the checkpoint with the best validation loss. I ended up with 0.02 nats/pixel cross entropy on validation vs. 0.04 on train, but only 14% IoU on validation vs. 24% on train.
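For reference, the two metrics being compared here (mean per-pixel cross entropy in nats, and IoU of the thresholded mask) can be computed for binary segmentation roughly like this. A minimal NumPy sketch with made-up arrays; the shapes, threshold, and ~8% positive rate are illustrative, not taken from the actual experiment:

```python
import numpy as np

# Hypothetical data: `labels` is a 0/1 ground-truth mask,
# `probs` are predicted foreground probabilities, both shape (H, W).
rng = np.random.default_rng(0)
labels = (rng.random((64, 64)) < 0.08).astype(float)  # ~8% positive pixels
probs = np.clip(labels * 0.7 + 0.3 * rng.random((64, 64)), 1e-7, 1 - 1e-7)

# Mean per-pixel binary cross entropy, in nats (natural log).
ce = -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))

# IoU of the thresholded prediction against the ground truth.
pred = probs > 0.5
inter = np.logical_and(pred, labels > 0.5).sum()
union = np.logical_or(pred, labels > 0.5).sum()
iou = inter / max(union, 1)  # guard against an empty union
```

Note that a low cross entropy and a low IoU can coexist: with ~8% positives, a model can be well-calibrated on average (small CE) while still thresholding into a poorly overlapping mask.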
_Arsenie_Boca_ t1_izg7d49 wrote
Not sure how this indicates overfitting on the validation set? Wouldn't that be indicated by much worse performance on a test set compared to the validation set? Haven't done a lot of image segmentation work, is this specific to the task?
CrazyCrab t1_izg96t3 wrote
I don't have a test set. It's not specific to a task.
_Arsenie_Boca_ t1_izgbbrj wrote
Then how can you tell if you overfitted on the validation set?
CrazyCrab t1_izgcu6i wrote
Ok, so my annotated data consists of about 50 images of size 10000x5000 pixels on average. The task is binary segmentation. Positives constitute approximately 8% of all pixels. 38 images are in the training part, 12 images are in the validation part (I divided them randomly).
The batch cross entropy plot and the validation cross entropy plot were crazy unstable during training. After a little bit of training there mostly wasn't any stable trend, either up or down. However, as time went on, the best validation cross entropy over all checkpoints kept going down and down...
So I think my checkpoint-selecting method gave me a model overfit to the validation dataset. I.e., I expect that on future samples the performance will be more like on the training dataset than on the validation dataset. The only other likely explanation I can think of is that I got unlucky and my validation dataset turned out to be significantly easier than my training dataset.
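This selection effect can be shown with a toy simulation (all numbers hypothetical, not from the actual experiment): if every checkpoint has the same true loss but each validation measurement is noisy, then picking the minimum over many checkpoints reports an optimistically biased number:

```python
import numpy as np

rng = np.random.default_rng(42)
true_loss = 0.04       # hypothetical true expected loss of every checkpoint
noise_std = 0.01       # noise in each checkpoint's validation estimate
n_checkpoints = 500    # checkpoints saved over a long training run

# Noisy validation-loss measurements for checkpoints of equal true quality.
val_losses = true_loss + noise_std * rng.standard_normal(n_checkpoints)

# What "take the checkpoint with the best validation loss" reports.
best = val_losses.min()
print(f"true loss: {true_loss:.3f}, best observed val loss: {best:.3f}")
```

With enough checkpoints and a small noisy validation set, the reported "best" loss drifts well below the true loss, which is consistent with seeing 0.02 on validation while the true performance is closer to the 0.04 seen on train.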