Own-Archer7158 t1_iy3mec9 wrote
Reply to comment by Own-Archer7158 in Can someone pls help me with this question and explain to me why you chose your answers? Many thanks! by CustardSignificant24
Note that the minimal loss is reached when the parameters make the neural network's predictions as close as possible to the true labels
Before reaching that minimum, the gradient is generally non-zero (except at a very unlucky local minimum or saddle point)
To better understand the underlying optimization problem, look at linear regression with a least-squares loss: in one dimension the loss is a quadratic function of the parameter, so it has a single global minimum and no local minima to get stuck in
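A minimal sketch of that one-dimensional case (the data, learning rate, and step count here are made up for illustration): the least-squares loss in the slope `w` is quadratic, so plain gradient descent walks straight down to the single global minimum, and the gradient stays non-zero until it gets there.

```python
import numpy as np

# Hypothetical 1-D data: fit y ≈ w * x with a least-squares loss.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x + rng.normal(scale=0.1, size=100)  # true slope is 3.0

def loss(w):
    # Mean squared error: a quadratic (convex) function of w,
    # so it has one global minimum and no local minima.
    return np.mean((w * x - y) ** 2)

def grad(w):
    # Derivative of the MSE above with respect to w.
    return np.mean(2 * x * (w * x - y))

w = 0.0
for _ in range(200):
    w -= 0.1 * grad(w)  # gradient stays non-zero until w nears the minimum

print(w)  # ends up close to the true slope 3.0
```

For a neural network the loss surface is no longer quadratic, which is exactly why local minima and saddle points can appear; the linear case is the clean baseline where they cannot.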