Submitted by windoze t3_ylixp5 in MachineLearning
BeatLeJuce t1_iuzz1ku wrote
Layer norm is not about fitting better, but about training more easily: activations don't explode, which makes optimization more stable.
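(For illustration, here's a minimal NumPy sketch of the normalization step; gamma and beta are stand-ins for the learnable scale and shift parameters, not anything from a specific library.)

```python
import numpy as np

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one example's activations to zero mean and unit variance
    along the feature axis, then apply a learnable scale and shift."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)  # normalized activations stay well-scaled
    return gamma * x_hat + beta

# Even if the raw activations are huge, the normalized output is not.
x = np.array([1e3, -2e3, 5e2, 0.0])
print(layer_norm(x))
```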
Is your list limited to "discoveries that are now used everywhere"? Because there are a lot of things that would have made it onto your list if you'd compiled it at different points in time, but that are now discarded (i.e., I'd say they were fads). E.g., GANs.
Other things are currently hyped, but it's not clear how they'll end up long term:
- Diffusion models are another thing that's currently hot.
- Combining multimodal inputs, which I'd say covers "CLIP-like things".
- Self-supervision is a topic as well (with "contrastive methods" having been a thing).
- Federated learning is likely here to stay.
- NeRF will likely have a lasting impact, too.
BrisklyBrusque t1_iv6otss wrote
I recall that experimenters disagreed on why batchnorm worked in the first place. Has the consensus settled?
BeatLeJuce t1_iv7co26 wrote
No. But we all agree that it's not due to internal covariate shift.