You are missing a test split. It's common to pick the best validation checkpoint, but you still want a test split (one that's completely unseen during training and model selection) to evaluate your final model.
You also need to be careful with the metrics you use to evaluate it, because your classes are very imbalanced.
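For example, plain pixel accuracy can look great while the rare classes are barely predicted at all. Here's a rough sketch of per-class IoU with numpy (toy numbers and class count, not your actual data):

```python
# Minimal sketch: imbalance-aware evaluation for pixel-wise classification.
# y_true / y_pred / NUM_CLASSES are made-up placeholders, not from the thread.
import numpy as np

NUM_CLASSES = 3

def per_class_iou(y_true, y_pred, num_classes):
    """Return IoU for each class from flattened label arrays."""
    ious = []
    for c in range(num_classes):
        tp = np.sum((y_true == c) & (y_pred == c))
        fp = np.sum((y_true != c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        denom = tp + fp + fn
        ious.append(tp / denom if denom > 0 else float("nan"))
    return np.array(ious)

# Toy example: class 0 dominates, so overall accuracy looks great
# even though the rarest class is mostly missed.
y_true = np.array([0] * 950 + [1] * 40 + [2] * 10)
y_pred = np.array([0] * 960 + [1] * 38 + [2] * 2)

ious = per_class_iou(y_true, y_pred, NUM_CLASSES)
print("overall pixel accuracy:", np.mean(y_true == y_pred))  # ~0.98, misleading
print("per-class IoU:", ious)                                # rare classes much lower
print("mean IoU:", np.nanmean(ious))
```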
I think with this few images I can't afford a test set. Also, since I have roughly 50 million pixels to classify in the validation set, and computer vision practitioners often don't keep a separate test split, I assumed I didn't really need one. Now I'm not sure.
Do you suggest doing cross-validation with the stopping criterion "train for exactly the same number of steps as in this run", or with "train with checkpointing and pick the best checkpoint, as in this run"?
I would take the best checkpoint per fold (i.e., the point where the validation loss starts diverging from the training loss), not the same number of steps, because the networks won't necessarily converge to a minimum at the same time; some may be stuck somewhere for longer.
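Something like this per fold (a minimal sketch assuming PyTorch and scikit-learn's KFold; the tiny MLP, random stand-in data, 5 folds, and 50-epoch cap are all illustrative, not your setup):

```python
# K-fold cross-validation where each fold keeps its own best checkpoint
# (lowest validation loss) instead of training a fixed number of steps.
import copy
import numpy as np
import torch
from torch import nn
from sklearn.model_selection import KFold

X = torch.randn(200, 16)           # stand-in features
y = torch.randint(0, 2, (200,))    # stand-in labels

def make_model():
    return nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
fold_scores = []

for fold, (tr, va) in enumerate(kfold.split(np.arange(len(X)))):
    tr, va = torch.as_tensor(tr), torch.as_tensor(va)
    model = make_model()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    best_val, best_state = float("inf"), None

    for epoch in range(50):                      # upper bound on training length
        model.train()
        opt.zero_grad()
        loss = loss_fn(model(X[tr]), y[tr])
        loss.backward()
        opt.step()

        model.eval()
        with torch.no_grad():
            val_loss = loss_fn(model(X[va]), y[va]).item()
        if val_loss < best_val:                  # keep the best checkpoint for this fold
            best_val = val_loss
            best_state = copy.deepcopy(model.state_dict())

    model.load_state_dict(best_state)            # each fold "stops" at its own best point
    fold_scores.append(best_val)
    print(f"fold {fold}: best val loss {best_val:.4f}")

print("mean best val loss across folds:", np.mean(fold_scores))
```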