Ok_Firefighter_2106 t1_iy6t95h wrote
2,3
2: For example, if you use zero values for initialization, then because of the symmetric structure of the NN every neuron in a layer computes the same output and receives the same gradient, so they all stay identical: the network fails to break the symmetry. The multi-layer NN then collapses to something no more expressive than a simple linear regression, so if the problem is non-linear, the NN just can't learn it (see the sketch below).
3: As explained in other answers.
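A minimal sketch of point 2 in plain numpy (the toy network, shapes, and learning rate are all made up for illustration): with all-zero weights every hidden unit computes the same thing and gets the same gradient (in this tanh case exactly zero), so the weights never move and nothing is learned.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))       # 8 samples, 3 features (toy data)
y = rng.normal(size=(8, 1))       # regression targets

W1 = np.zeros((3, 4))             # zero-initialized hidden layer (4 units)
W2 = np.zeros((4, 1))             # zero-initialized output layer

for _ in range(100):
    h = np.tanh(X @ W1)           # hidden activations: identical across units (all zero here)
    err = h @ W2 - y              # prediction error
    dW2 = h.T @ err / len(X)      # gradient for W2 (zero, since h is zero)
    dW1 = X.T @ (err @ W2.T * (1 - h**2)) / len(X)  # gradient for W1 (also zero)
    W1 -= 0.1 * dW1
    W2 -= 0.1 * dW2

print(np.allclose(W1, 0), np.allclose(W2, 0))  # True True: symmetry never broken, weights never moved
```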