Submitted by hardmaru t3_ys36do in MachineLearning
DrXaos t1_iw04agd wrote
Reply to comment by vjb_reddit_scrap in [R] ZerO Initialization: Initializing Neural Networks with only Zeros and Ones by hardmaru
That’s a different scenario, and one that’s clearly justified by the dynamics.
Any recurrent neural network is a nonlinear dynamical system, and learning tends to work best at the boundary between dissipation and chaos (vanishing vs. exploding gradients).
The additive incorporation of new information in LSTM/GRU cells greatly ameliorates the usual problem of RNNs with random transition matrices, where perturbations evolve multiplicatively. Initializing the transition matrix to the identity, which starts the RNN at a zero Lyapunov exponent, helps for the same reason.
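A toy sketch of that point, assuming a purely linear RNN h ← W·h and NumPy (the matrix sizes and the 1.5/√n scale are illustrative choices, not from the comment): under identity initialization a perturbation is carried forward unchanged (zero Lyapunov exponent), while under a random transition matrix with spectral radius above one it grows multiplicatively.

```python
import numpy as np

rng = np.random.default_rng(0)
n, steps = 64, 50

# Random Gaussian transition matrix. With std 1.5/sqrt(n) the spectral
# radius is about 1.5, so repeated multiplication explodes perturbations.
W_rand = rng.normal(scale=1.5 / np.sqrt(n), size=(n, n))

# Identity transition matrix: perturbations are carried forward unchanged,
# i.e. the linearized dynamics sit at a zero Lyapunov exponent.
W_id = np.eye(n)

def perturbation_norms(W, steps):
    """Evolve a unit perturbation under h <- W h and record its norm."""
    d = rng.normal(size=n)
    d /= np.linalg.norm(d)
    norms = []
    for _ in range(steps):
        d = W @ d
        norms.append(np.linalg.norm(d))
    return norms

norms_rand = perturbation_norms(W_rand, steps)
norms_id = perturbation_norms(W_id, steps)

print(f"after {steps} steps: random init |d| = {norms_rand[-1]:.3e}, "
      f"identity init |d| = {norms_id[-1]:.3e}")
```

The identity run stays at norm 1 forever, while the random run blows up by many orders of magnitude, which is exactly the exploding-gradient failure mode the comment describes.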