
Mental-Reference8330 t1_j8xup7w wrote

In the early days, researchers considered the architecture itself to be a form of regularization. LeCun didn't invent the idea, but he popularized the view that a convolutional layer (as in LeNet, in his case) is a fully-connected layer constrained to only allow solutions where the layer weights can be expressed in terms of a convolution kernel, i.e., shared and locally connected. Likewise, the ResNet paper motivated residual connections as a "constraint" that lets very deep networks start optimization near a good solution (the identity mapping), even though you can convert a ResNet into an equivalent model without skip connections, with no loss of precision. Both points are sketched below.
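To make the convolution point concrete, here's a minimal numpy sketch (all names illustrative; 1D case, kernel size 3, "valid" padding): the convolution's output matches a dense layer whose weight matrix is forced into the shifted, weight-shared pattern of the kernel.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3                      # input length, kernel size
x = rng.standard_normal(n)
kernel = rng.standard_normal(k)

# Direct convolution (cross-correlation, "valid" mode): n - k + 1 outputs.
conv_out = np.array([kernel @ x[i:i + k] for i in range(n - k + 1)])

# The same operation as a dense layer: each row of the weight matrix is the
# kernel, shifted by one position -- the "convolutional constraint" on a
# fully-connected layer (shared weights, local connectivity).
W = np.zeros((n - k + 1, n))
for i in range(n - k + 1):
    W[i, i:i + k] = kernel

dense_out = W @ x
assert np.allclose(conv_out, dense_out)
```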

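And a sketch of the conversion claim for the linear case (nonlinear blocks, y = x + f(x), don't fold this simply): the skip connection of a residual layer can be absorbed into the weights exactly, so the residual form changes the optimization landscape, not the set of representable functions.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5
x = rng.standard_normal(d)
W = rng.standard_normal((d, d))

# Residual layer: identity skip connection plus a linear transform.
residual_out = x + W @ x

# Equivalent plain layer: fold the identity into the weight matrix.
folded_out = (np.eye(d) + W) @ x

assert np.allclose(residual_out, folded_out)
```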