Submitted by PleaseKillMeNowOkay t3_xtadfd in deeplearning
sydjashim t1_iqtuqdt wrote
Reply to comment by thebear96 in Neural network that models a probability distribution by PleaseKillMeNowOkay
Can you explain why the model would converge quicker?
thebear96 t1_iqtxrnx wrote
Well, I assumed that the network had more layers and therefore more parameters. More parameters can represent the data better and faster. For example, if you had a dataset with 30 features and used a linear layer with 64 neurons, it should be able to represent each data point more quickly and easily than, say, a linear layer with 16 neurons. That's why I thought the model would converge faster. But in OP's case the hidden layers are the same; only the output layer has more neurons. In that case we won't see faster convergence.
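To make the comparison concrete, here is a minimal PyTorch sketch (assuming a 30-feature input and arbitrary layer sizes; this is illustrative, not OP's actual code) contrasting a wider hidden layer with the case where only the output layer grows:

```python
# Hypothetical sketch contrasting the two cases discussed above:
# a wider hidden layer vs. a wider output layer, for a 30-feature input.
import torch.nn as nn

n_features = 30  # assumed input size from the example above

# Wider hidden layer: more parameters where the representation is learned;
# this is the case argued to converge faster.
wide_hidden = nn.Sequential(
    nn.Linear(n_features, 64),
    nn.ReLU(),
    nn.Linear(64, 1),
)

# Narrower hidden layer for comparison.
narrow_hidden = nn.Sequential(
    nn.Linear(n_features, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)

# OP's situation as described: identical hidden layers, only the output
# layer is larger (e.g. predicting several distribution parameters).
same_hidden_bigger_output = nn.Sequential(
    nn.Linear(n_features, 16),
    nn.ReLU(),
    nn.Linear(16, 4),  # extra output neurons, hidden capacity unchanged
)

count_params = lambda m: sum(p.numel() for p in m.parameters())
print(count_params(wide_hidden), count_params(narrow_hidden), count_params(same_hidden_bigger_output))
```

Printing the parameter counts shows that widening the hidden layer adds far more capacity than adding a few output neurons, which is the point being made about why OP's change alone wouldn't speed up convergence.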