Submitted by PleaseKillMeNowOkay t3_xtadfd in deeplearning
I have a neural network whose outputs are the parameters of a probability distribution. I have a second neural network whose outputs are the parameters of a distribution with a more general covariance structure than the first (the first distribution is a special case of the second). Is my second network guaranteed to perform at least as well as my first?
Apologies for the vague description. I am not sure how much I'm allowed to talk about it. Literally, any help is appreciated.
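For concreteness, a common instance of this setup is a network head that outputs Gaussian parameters, either with a diagonal covariance or with a full covariance via a Cholesky factor. The post does not say which distribution is involved, so this is only a hedged sketch of the nesting relationship: setting the off-diagonal Cholesky entries to zero makes the full-covariance likelihood coincide with the diagonal one.

```python
import numpy as np

def nll_diag(x, mu, log_sigma):
    # Negative log-likelihood of a diagonal-covariance Gaussian,
    # parameterized by per-dimension mean and log standard deviation.
    var = np.exp(2.0 * log_sigma)
    return 0.5 * np.sum((x - mu) ** 2 / var + 2.0 * log_sigma + np.log(2.0 * np.pi))

def nll_full(x, mu, L):
    # Negative log-likelihood of a full-covariance Gaussian with
    # Sigma = L @ L.T, L lower-triangular with positive diagonal.
    d = len(x)
    z = np.linalg.solve(L, x - mu)          # whitened residual
    log_det = 2.0 * np.sum(np.log(np.diag(L)))  # log det(Sigma)
    return 0.5 * (z @ z + log_det + d * np.log(2.0 * np.pi))

# Nesting check: with zero off-diagonals, the full model reduces to the diagonal one.
x = np.array([0.3, -1.2])
mu = np.array([0.0, 0.5])
log_sigma = np.array([0.1, -0.4])
L_diag = np.diag(np.exp(log_sigma))
assert np.isclose(nll_diag(x, mu, log_sigma), nll_full(x, mu, L_diag))
```

Because the full-covariance family strictly contains the diagonal one, the second model can always represent the first model's best fit; whether it actually reaches it depends on optimization and regularization, not just capacity.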
thebear96 t1_iqqsaoe wrote
Assuming the same hyperparameters, the second network should theoretically converge to a solution faster. So you will need to tune the hyperparameters and maybe add some dropout so that the model doesn't overfit.