
Ok_Firefighter_2106 t1_iy6t95h wrote

2,3

2: For example, if you use zero values for initialization, then by the symmetric nature of the NN every neuron in a layer computes the same output and receives the same gradient, so they all stay identical. The multi-layer NN then collapses to something no more expressive than simple linear regression, because it fails to break the symmetry. Therefore, if the problem is non-linear, the NN just can't learn it.
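A minimal sketch of the symmetry problem, assuming a hypothetical tiny 2-layer network trained by hand with NumPy (all names and the constant-0.5 initialization are made up for illustration). Every hidden unit starts with identical weights, so each forward pass and each gradient is identical across units, and they never differentiate:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))          # 8 samples, 4 features
y = rng.normal(size=(8, 1))

# Symmetric initialization: every hidden unit gets the same weights.
W1 = np.full((4, 3), 0.5)            # hidden layer, 3 identical units
b1 = np.zeros(3)
W2 = np.full((3, 1), 0.5)
b2 = np.zeros(1)

for _ in range(100):                 # plain gradient descent on MSE
    h = np.tanh(X @ W1 + b1)         # forward pass
    pred = h @ W2 + b2
    grad_pred = 2 * (pred - y) / len(X)
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(0)
    grad_h = grad_pred @ W2.T        # same gradient flows to every unit
    grad_z = grad_h * (1 - h**2)
    grad_W1 = X.T @ grad_z
    grad_b1 = grad_z.sum(0)
    W1 -= 0.1 * grad_W1; b1 -= 0.1 * grad_b1
    W2 -= 0.1 * grad_W2; b2 -= 0.1 * grad_b2

# The weights have moved, but all hidden-unit columns are still identical:
# the symmetry is never broken, so the 3 units act as a single unit.
print(np.allclose(W1[:, 0], W1[:, 1]) and np.allclose(W1[:, 1], W1[:, 2]))
```

This prints `True`: training changes the weights but keeps every hidden unit a clone of the others, which is why random initialization is needed.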


3: As explained in the other answers.
