The_Sodomeister t1_iscenee wrote
*If* your hypothesis is true (and I don't have enough direct experience with VAEs to say for certain), then how would you distinguish the dimensions that are outputting approximately-Gaussian noise from the dimensions that are outputting meaningful signal? Who's to say that the meaningful signal doesn't also appear approximately Gaussian? Or at least sufficiently Gaussian that it isn't easily distinguishable from the others.
While I wouldn't go so far as to say that your hypothesis "doesn't happen", I also know from personal experience with other networks that NN models will tend to naturally over-parameterize if you let them. Regularization methods don't usually prevent the model from utilizing extra dimensions when it is able to, and it's not always clear whether the model could achieve the same performance with fewer dimensions vs. whether the extra dimensions are truly adding more representation capacity.
If some latent dimensions truly aren't contributing at all to the meaningful encoding, then I would think you could more reliably identify this by looking at the weights in the decoder layers (since those dimensions wouldn't be needed to reconstruct the encoded input). I don't think this is as easy as it sounds, but I find it more plausible than determining this information strictly from comparing the distributions of the latent dimensions.
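For example (a rough sketch, not something I've verified on a real VAE): if the decoder's first layer is a plain linear layer, the per-column weight norms tell you how strongly it reads each latent dimension. `decoder_first_layer` is just a placeholder name here, not anything from your setup.

```python
import torch

def latent_weight_norms(decoder_first_layer: torch.nn.Linear) -> torch.Tensor:
    # nn.Linear stores weights as (out_features, in_features), so column j
    # multiplies latent dimension j. Its norm is a rough proxy for how much
    # the decoder actually uses that dimension.
    return decoder_first_layer.weight.norm(dim=0)

# norms = latent_weight_norms(model.decoder[0])
# Near-zero columns *suggest* (but don't prove) unused dimensions; later
# layers could still amplify a small signal, which is why I say it's not
# as easy as it sounds.
```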
The_Sodomeister t1_isci8nh wrote
Reply to comment by grid_world in Variational Autoencoder automatic latent dimension selection by grid_world
This is the part I'm referencing:
> The unneeded dimensions don't learn anything meaningful and therefore remain a standard, multivariate, Gaussian distribution. This serves as a signal that such dimensions can safely be removed without significantly impacting the model's performance.
How do you explicitly measure and use this "signal"? I don't think you'd get far by just measuring "how far from Gaussian" each dimension is, as you'd almost certainly end up throwing away some useful dimensions that simply appear "Gaussian enough".
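To be concrete about what I think that "signal" would have to look like in practice, here's a hypothetical sketch: the usual per-dimension KL term against the N(0, 1) prior, averaged over the data. `mu` and `logvar` are placeholder arrays collected from your encoder, not anything from your post.

```python
import numpy as np

def per_dimension_kl(mu: np.ndarray, logvar: np.ndarray) -> np.ndarray:
    # Standard closed-form KL of N(mu, sigma^2) against N(0, 1),
    # computed element-wise; mu and logvar are (n_samples, latent_dim).
    kl = 0.5 * (mu**2 + np.exp(logvar) - logvar - 1.0)
    return kl.mean(axis=0)  # one score per latent dimension

# kl_per_dim = per_dimension_kl(mu, logvar)
# Dimensions with KL near zero match the prior, but picking the cutoff is
# exactly the problem: a genuinely useful dimension can still look
# "Gaussian enough" under any fixed threshold.
```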
> I didn’t get your point about looking at the decoder weights to figure out whether they are contributing? Do you compare them to their randomly initialized values to infer this?
If the model reaches this "optimal state" where certain dimensions aren't contributing to the decoder output, then you should be able to detect this with some form of sensitivity analysis - i.e. changing the values in those dimensions shouldn't affect the decoder output, if those dimensions aren't being used.
This assumes that the model would correctly learn to ignore unnecessary latent dimensions, but I'm not confident it would actually accomplish that.
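To make the sensitivity-analysis idea concrete, something like this rough sketch is what I have in mind (assuming a PyTorch-style decoder; `decoder` and `z` are placeholders, not your actual code):

```python
import torch

@torch.no_grad()
def latent_sensitivity(decoder, z: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    # Perturb one latent dimension at a time and measure how much the
    # decoder output moves. z is a (batch, latent_dim) tensor of encodings.
    baseline = decoder(z)
    scores = []
    for j in range(z.shape[1]):
        z_perturbed = z.clone()
        z_perturbed[:, j] += eps  # nudge only dimension j
        delta = decoder(z_perturbed) - baseline
        scores.append(delta.abs().mean().item())
    return torch.tensor(scores)

# scores = latent_sensitivity(model.decoder, z_batch)
# Dimensions with scores near zero barely affect the reconstruction --
# assuming the model really did learn to ignore them, which is the part
# I'm not confident about.
```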