serge_cell t1_j5sj24c wrote

Hessian-free second-order methods will likely not work. There are reasons why everyone uses gradient descent. The only second-order method that seems to work is K-FAC (disclaimer: I have no first-hand experience), but since you will be using Julia you will have to implement it from scratch, and that is highly non-trivial (as you would expect from a method that works where others have failed).
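
For anyone unsure what "Hessian-free" means here: the Hessian is never formed. The Newton system H d = -g is solved approximately with conjugate gradient, which needs only Hessian-vector products. Below is a minimal sketch in plain Python on a toy 2-D quadratic. The loss, the finite-difference `hvp`, and all function names are assumptions made up for illustration, not anything from the thread, and a toy like this says nothing about how the method behaves on real networks.

```python
# Sketch of one Hessian-free Newton step: solve H d = -g with conjugate
# gradient, using only Hessian-vector products (H is never built explicitly).
# The toy loss and every name below are illustrative assumptions.

def grad(w):
    # Gradient of f(w) = 0.5*(3*w0^2 + 2*w0*w1 + 2*w1^2) - w0 - w1
    return [3.0 * w[0] + w[1] - 1.0, w[0] + 2.0 * w[1] - 1.0]

def hvp(w, v, eps=1e-6):
    # Finite-difference Hessian-vector product:
    # H v is approximated by (grad(w + eps*v) - grad(w)) / eps
    g0 = grad(w)
    g1 = grad([wi + eps * vi for wi, vi in zip(w, v)])
    return [(a - b) / eps for a, b in zip(g1, g0)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hessian_free_step(w, iters=10, tol=1e-10):
    g = grad(w)
    d = [0.0] * len(w)          # search direction, starts at zero
    r = [-gi for gi in g]       # residual of H d = -g at d = 0
    p = list(r)                 # first CG direction
    rs = dot(r, r)
    for _ in range(iters):
        if rs < tol:            # squared residual norm small enough
            break
        hp = hvp(w, p)
        alpha = rs / dot(p, hp)
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return [wi + di for wi, di in zip(w, d)]

w_new = hessian_free_step([0.0, 0.0])
```

On this quadratic a single step should land close to the exact minimizer (0.2, 0.4), since conjugate gradient solves a 2-D linear system in two iterations up to finite-difference error.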