
BeatLeJuce t1_iuzz1ku wrote

Layer norm is not about fitting better, but about training more easily (activations don't explode, which makes optimization more stable).
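
To illustrate the point about keeping activation scale bounded, here's a minimal NumPy sketch of layer normalization (my own illustration, not from the comment; the function name and parameters are made up for the example):

```python
import numpy as np

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each sample's activations across its feature dimension,
    # so their scale stays roughly the same regardless of depth.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # Learnable scale and shift (here just scalars) restore expressiveness.
    return gamma * x_hat + beta

# Even activations with a huge scale come out with unit variance per row:
acts = np.random.randn(4, 8) * 1e3
print(layer_norm(acts).std(axis=-1))  # roughly 1 for every row
```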

Is your list limited to "discoveries that are now used everywhere"? Because there are a lot of things that would've made it onto your list if you'd compiled it at different points in time but are now discarded (i.e., I'd say they are fads), e.g. GANs.

Other things are currently hyped but it's not clear how they'll end up long term:

Diffusion models are another thing that's currently hot.

Combining multimodal inputs, which I'd describe as "CLIP-like things".

There's self-supervision as a topic as well (with "contrastive methods" having been a thing).

Federated learning is likely here to stay.

NeRF will likely have a lasting impact, too.

3

BrisklyBrusque t1_iv6otss wrote

I recall that experimenters disagreed on why batch norm worked in the first place. Has the consensus settled?

1

BeatLeJuce t1_iv7co26 wrote

No. But we all agree that it's not due to internal covariate shift.

2