
nibbajenkem t1_j8ojasp wrote

Of course, more inductive biases trivially lead to better generalization. It's just not clear to me why you cannot forgo the neural network and all its weaknesses and instead simply optimize the coefficients of the physical model itself. I.e., in OP's example, why have a physics-based loss with a prior that it's a damped oscillator, instead of just directly fitting whatever functional class(es) describe damped oscillators (roughly what's sketched below)?

I don't have much physics expertise beyond the basics, so I might be misunderstanding the true depth of the problem, though.
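A minimal sketch of that "just fit the coefficients" option, assuming the standard damped-oscillator form x(t) = A·exp(-γt)·cos(ωt + φ) and synthetic noisy data (the parameter values and the scipy-based fit are illustrative, not OP's actual setup):

```python
# Sketch: directly optimize the damped-oscillator coefficients with least squares,
# instead of training a physics-informed neural network.
import numpy as np
from scipy.optimize import curve_fit

def damped_oscillator(t, A, gamma, omega, phi):
    # x(t) = A * exp(-gamma * t) * cos(omega * t + phi)
    return A * np.exp(-gamma * t) * np.cos(omega * t + phi)

# Synthetic noisy observations (stand-in for the measured data)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200)
x_obs = damped_oscillator(t, A=1.0, gamma=0.3, omega=2.0, phi=0.5) \
        + 0.05 * rng.standard_normal(t.size)

# Fit the four physical coefficients directly
p0 = [1.0, 0.1, 1.0, 0.0]  # initial guess: A, gamma, omega, phi
params, cov = curve_fit(damped_oscillator, t, x_obs, p0=p0)
print("fitted A, gamma, omega, phi:", params)
```

With only a handful of physically meaningful coefficients to estimate, there is no network at all, which is exactly the alternative being asked about.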

1

nibbajenkem t1_j4wii8d wrote

It's pretty simple. Deep neural networks are extremely underspecified by the data they train on (https://arxiv.org/abs/2011.03395). Less data means more underspecification, and thus the model more readily gets stuck in local minima; more data means you can more easily avoid certain local minima. So the question boils down to how well the learned features transfer to different datasets. ImageNet pretraining generally works well because it's a diverse, large-scale dataset, which means models trained on it will by default avoid learning a lot of "silly" features.
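As a rough illustration of what reusing those ImageNet features usually looks like in practice (assuming torchvision's pretrained ResNet-18 and an arbitrary 10-class downstream task; both are stand-ins, not anything from the linked paper):

```python
# Sketch: transfer ImageNet-pretrained features to a new task.
import torch
import torch.nn as nn
from torchvision import models

# Load a backbone pretrained on ImageNet; its features tend to transfer well
# because ImageNet is large and diverse.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze the pretrained features and train only a new head, which is often
# enough when the downstream dataset is small.
for p in model.parameters():
    p.requires_grad = False

num_classes = 10  # illustrative: size of the downstream label set
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new, trainable head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
# ...standard training loop over the downstream dataset goes here...
```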

14