junetwentyfirst2020 t1_j4wrzut wrote on January 18, 2023 at 8:18 PM

The way I like to think about this is that the algorithm has to model many things. If you’re trying to learn whether the image contains a dog or not, first you have to model natural imagery, correlations between features, and maybe even a little 2D-to-3D to simplify invariances. I’m speaking hypothetically here, because the underlying model is quite latent and hard to inspect.

If you train from scratch you need to do all of these tasks on a dataset that is likely much smaller than is required to do all of them without overfitting. If you use a pretrained model, instead of learning all of those tasks, you instead have a model that only has to learn just one additional thing on the same amount of data.