
pwsiegel t1_jcbjf82 wrote

It's a property of the field in general: there is very little theory to guide neural architecture design, just heuristics backed by trial-and-error experimentation. Deep learning models are fun, but in practice you spend a lot of your time trying to trick gradient descent into converging faster.
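(A concrete illustration of one such trick, not from the comment itself: momentum is a classic heuristic for speeding up gradient descent. The sketch below minimizes a toy quadratic `f(x) = x**2` with plain gradient descent versus heavy-ball momentum; the function name `gd` and all parameters are made up for the example.)

```python
def gd(steps=100, lr=0.01, momentum=0.0, x0=10.0):
    """Gradient descent on f(x) = x**2, with optional heavy-ball momentum."""
    x, v = x0, 0.0
    for _ in range(steps):
        grad = 2 * x              # f'(x) = 2x
        v = momentum * v - lr * grad  # velocity accumulates past gradients
        x = x + v
    return x

# With a small learning rate, plain gradient descent crawls toward the
# minimum at x = 0, while momentum gets there much closer in the same
# number of steps -- exactly the kind of convergence trick referred to above.
plain = abs(gd(momentum=0.0))
boosted = abs(gd(momentum=0.9))
print(plain, boosted)
```

Real architectures add many more such tricks on top (normalization layers, residual connections, learning-rate warmup), all motivated by the same goal of making optimization behave.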


currentscurrents t1_jcclq02 wrote

The whole thing seems very bitter-lesson-y, and I suspect that in the future we'll have a very general architecture that learns to reconfigure itself for the data.
