
UnusualClimberBear t1_iqna2rw wrote

Using the full (full-batch) gradient does not work well for neural networks. Plus, Adam keeps a coarse estimate of the curvature, so it behaves more like a second-order method, even if you can find some functions where its estimates are not good.
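To make the curvature point concrete, here is a minimal sketch in plain Python, not anyone's reference implementation. It assumes a toy quadratic f(p) = 0.5 * (100 * p0^2 + p1^2), and the helper names (`grad`, `sgd_step`, `adam_step`) are made up for illustration. The update follows the standard Adam rule with its usual default hyperparameters. The running average of squared gradients `v` plays the role of the coarse, per-coordinate curvature estimate: dividing by its square root rescales each coordinate separately, which plain gradient descent does not do.

```python
import math

def grad(p, scales=(100.0, 1.0)):
    # Gradient of the toy quadratic f(p) = 0.5 * sum(s_i * p_i**2).
    # The scales are an assumption chosen to make the problem ill-conditioned.
    return [s * x for s, x in zip(scales, p)]

def sgd_step(p, g, lr):
    # Plain gradient descent: one learning rate shared by all coordinates.
    return [x - lr * gi for x, gi in zip(p, g)]

def adam_step(p, g, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    # First moment: exponential moving average of the gradient.
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    # Second moment: exponential moving average of the squared gradient.
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # Bias correction for the zero initialisation of m and v (t starts at 1).
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    # Per-coordinate rescaling by sqrt(v_hat).
    p = [x - lr * mh / (math.sqrt(vh) + eps)
         for x, mh, vh in zip(p, m_hat, v_hat)]
    return p, m, v

start = [1.0, 1.0]
g0 = grad(start)

sgd_p = sgd_step(start, g0, lr=0.1)
adam_p, m, v = adam_step(start, g0, [0.0, 0.0], [0.0, 0.0], t=1, lr=0.1)

print("gradient descent:", sgd_p)
print("Adam:            ", adam_p)
```

With the same learning rate of 0.1, the gradient descent step overshoots badly along the steep coordinate (it lands near -9), while Adam's first step moves both coordinates by about 0.1 regardless of gradient scale. This is only a diagonal rescaling, not a true Hessian inverse, which fits the "coarse" qualifier.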
