Submitted by NadaBrothers t3_10kmc7n in MachineLearning

I work as a researcher and am kind of new to neural networks. I have an RNN (1e4 x 1e4 network) that I would like to train in either MATLAB or Julia.

One option I considered is writing my own code for Hessian-free optimization, but the implementation details are really, really hard to figure out.
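
For orientation, the core of a Hessian-free step can be sketched in a few lines. This is a minimal toy illustration, not a reference implementation: the quadratic loss, the finite-difference Hessian-vector product, the fixed damping value, and all function names are assumptions made for the example. Practical implementations usually compute curvature-vector products with automatic differentiation and adapt the damping as training proceeds.

```python
# Toy sketch of Hessian-free (truncated-Newton) steps. All names are illustrative.

def grad(w):
    """Gradient of the toy loss f(w) = 0.5 * w^T A w - b^T w (A symmetric positive definite)."""
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    return [sum(A[i][j] * w[j] for j in range(2)) - b[i] for i in range(2)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def hessian_vector_product(grad_fn, w, v, eps=1e-6):
    """Approximate H v by a finite difference of gradients; H itself is never formed."""
    g0 = grad_fn(w)
    g1 = grad_fn([wi + eps * vi for wi, vi in zip(w, v)])
    return [(a - b) / eps for a, b in zip(g1, g0)]

def cg_solve(matvec, rhs, max_iters=50, tol=1e-10):
    """Conjugate gradient for matvec(x) = rhs, using only matrix-vector products."""
    x = [0.0] * len(rhs)
    r = list(rhs)              # residual rhs - matvec(x), with x = 0
    p = list(r)                # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iters):
        if rs_old < tol:       # squared residual small enough: stop
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

def hf_step(grad_fn, w, damping=1e-3):
    """Solve (H + damping * I) d = -g with CG and return the updated point w + d."""
    g = grad_fn(w)

    def matvec(v):
        Hv = hessian_vector_product(grad_fn, w, v)
        return [h + damping * vi for h, vi in zip(Hv, v)]

    d = cg_solve(matvec, [-gi for gi in g])
    return [wi + di for wi, di in zip(w, d)]

w = [0.0, 0.0]
for _ in range(5):
    w = hf_step(grad, w)
print(w)  # should be close to the exact minimizer [1/11, 7/11]
```

The point of the structure is that only gradient evaluations and matrix-vector products are needed, so the full Hessian is never stored. For an RNN, `grad` would be replaced by backpropagation through time over a batch of sequences.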

I am aware there is a Theano or TF implementation of HFO, but I am primarily interested in having the code in MATLAB/Julia.

Also, are there better alternatives to Hessian-free optimization for training RNNs?

6

Comments


limpbizkit4prez t1_j5rmvr6 wrote

I know you said you are interested in MATLAB or Julia, but I'm curious: why not a Python library? A simple Google search turns up lots of PyTorch HFO solutions.

5

Gemabo t1_j5rwh5f wrote

MATLAB has a Deep Learning Toolbox that makes it easy and efficient to train any type of model, including RNNs. That said, there is a good argument (and a famous paper) that anything you can do with an RNN you can do better with a CNN. Julia has deep learning libraries, but don't expect nearly the level of support and ease of use you get with MATLAB. MATLAB's DL is underrated.

4

FinancialElephant t1_j5s5y72 wrote

Flux.jl is the most popular deep learning library in Julia. I've played around with it a little, and it's quite nice and easy to use. It is amazing how much more elegant the implementations become in Julia compared to Python.

There is also the lesser-known Lux.jl package, which is essentially an explicitly parameterized Flux (less mature than Flux, though).

6

serge_cell t1_j5sj24c wrote

Hessian-free second-order optimization is unlikely to work. There are reasons why everyone uses gradient descent. The only second-order method that seems to work is K-FAC (disclaimer: I have no first-hand experience), but since you will be using Julia you will have to implement it from scratch, and that is highly non-trivial (as you would expect from a method that works where others have failed).

3

yarasa t1_j5tid8i wrote

Can you not train in Python, dump the results to a file, and run your analysis on that? Either you have to be an expert in the details of the implementation or you have to use the setup everyone else is using.
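
A minimal sketch of that workflow, assuming the trained weights end up as a plain list of rows (the `weights` values, the file name, and the helper names are placeholders for illustration). It writes a CSV file, which MATLAB can load with `readmatrix` and Julia with `readdlm` from the DelimitedFiles standard library.

```python
import csv
import os
import tempfile

def save_matrix_csv(matrix, path):
    """Write a list-of-rows matrix to CSV, one matrix row per line."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in matrix:
            # repr() keeps enough digits for floats to round-trip exactly
            writer.writerow([repr(x) for x in row])

def load_matrix_csv(path):
    """Read the CSV back into a list of rows of floats (sanity check only)."""
    with open(path, newline="") as f:
        return [[float(x) for x in row] for row in csv.reader(f)]

# Placeholder standing in for trained RNN weights.
weights = [[0.5, -1.25, 3.0], [2.0, 0.0, -0.75]]

path = os.path.join(tempfile.gettempdir(), "rnn_weights.csv")
save_matrix_csv(weights, path)
restored = load_matrix_csv(path)
print(restored == weights)
```

For a matrix as large as 1e4 x 1e4, a text file gets unwieldy, so a binary format such as a .mat or HDF5 file is probably the more practical choice; CSV is used here only because it needs nothing beyond the standard library.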

1