YouAgainShmidhoobuh

YouAgainShmidhoobuh t1_jd2qmh1 wrote

Not entirely the same thing. VAEs offer approximate likelihood estimation, not exact. The difference here is key: VAEs do not optimize the log-likelihood directly; they optimize the evidence lower bound (ELBO), which is a lower bound on it. Flow-based methods are exact: we go from an easy, tractable distribution to a more complex one through invertible transformations, and the change of variables theorem guarantees at each layer that the learned distribution is a valid probability distribution.
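
To make the exact-versus-bound distinction concrete, here is a minimal sketch on toy one-dimensional models where everything has a closed form. The affine flow, the model `z ~ N(0, 1), x | z ~ N(z, 1)`, and all function names are assumptions chosen for illustration, not taken from any particular paper or library.

```python
import math

LOG_2PI = math.log(2.0 * math.pi)

def normal_logpdf(x, mean, std):
    """Log-density of N(mean, std^2) evaluated at x."""
    return -0.5 * LOG_2PI - math.log(std) - 0.5 * ((x - mean) / std) ** 2

# Flow: exact log-likelihood via the change of variables formula.
# Hypothetical one-layer flow x = scale * z + shift with z ~ N(0, 1).
def flow_logpdf(x, scale, shift):
    z = (x - shift) / scale                   # invert the flow
    log_det_inverse = -math.log(abs(scale))   # log |dz/dx|
    return normal_logpdf(z, 0.0, 1.0) + log_det_inverse

# VAE-style bound on an assumed toy model:
# z ~ N(0, 1), x | z ~ N(z, 1), so the true marginal is p(x) = N(0, 2).
def exact_log_marginal(x):
    return normal_logpdf(x, 0.0, math.sqrt(2.0))

def elbo(x, q_mean, q_std):
    """Analytic ELBO for the approximate posterior q(z | x) = N(q_mean, q_std^2)."""
    expected_log_lik = -0.5 * LOG_2PI - 0.5 * ((x - q_mean) ** 2 + q_std ** 2)
    expected_log_prior = -0.5 * LOG_2PI - 0.5 * (q_mean ** 2 + q_std ** 2)
    entropy = 0.5 * (LOG_2PI + 1.0) + math.log(q_std)
    return expected_log_lik + expected_log_prior + entropy

x = 1.5
# The flow density matches the closed-form N(shift, scale^2) density.
flow_gap = flow_logpdf(x, 2.0, 0.5) - normal_logpdf(x, 0.5, 2.0)
# A mismatched q leaves a gap between the ELBO and log p(x).
loose_gap = exact_log_marginal(x) - elbo(x, 0.0, 1.0)
# The gap closes only when q equals the true posterior N(x / 2, 1 / 2).
tight_gap = exact_log_marginal(x) - elbo(x, x / 2.0, math.sqrt(0.5))
print(flow_gap, loose_gap, tight_gap)
```

The flow never has a gap to close, while the ELBO is tight only if the approximate posterior happens to match the true one, which a real VAE encoder generally cannot guarantee.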

Of course, they both (try to) learn some probability distribution of the training data, and that is how they differ from GAN approaches, which do not directly learn a probability distribution.

For more insight you might want to look at https://openreview.net/pdf?id=HklKEUUY_E

2

YouAgainShmidhoobuh t1_jd2n2v5 wrote

ResNets do not tackle the vanishing gradient problem. The authors specifically mention that vanishing gradients had already been largely addressed by normalized initialization and intermediate normalization layers, BatchNorm in particular. So removing BatchNorm from the equation will most likely bring vanishing gradients back.

I am assuming you are using a WGAN with gradient penalty, since that would explain the gradient penalty violation. In that case, use LayerNorm instead, as indicated here: https://github.com/LynnHo/DCGAN-LSGAN-WGAN-GP-DRAGAN-Tensorflow-2/issues/3
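
As I understand the usual argument, the gradient penalty is computed per sample, so it assumes the critic's output for one input does not depend on the other inputs in the batch. BatchNorm breaks that assumption and LayerNorm does not. The toy sketch below shows only that dependence, using plain Python lists; the helper functions and the example batches are made up for illustration and are not the code from the linked repository.

```python
import math

EPS = 1e-5

def layer_norm(batch):
    """Normalize each sample over its own features (no cross-sample statistics)."""
    out = []
    for sample in batch:
        mean = sum(sample) / len(sample)
        var = sum((v - mean) ** 2 for v in sample) / len(sample)
        out.append([(v - mean) / math.sqrt(var + EPS) for v in sample])
    return out

def batch_norm(batch):
    """Normalize each feature over the batch (training-mode statistics)."""
    n = len(batch)
    n_features = len(batch[0])
    means = [sum(s[j] for s in batch) / n for j in range(n_features)]
    variances = [
        sum((s[j] - means[j]) ** 2 for s in batch) / n for j in range(n_features)
    ]
    return [
        [(s[j] - means[j]) / math.sqrt(variances[j] + EPS) for j in range(n_features)]
        for s in batch
    ]

# Two batches that differ only in their second sample.
batch_a = [[1.0, 2.0, 3.0], [4.0, 0.0, 2.0], [0.0, 1.0, 5.0]]
batch_b = [[1.0, 2.0, 3.0], [9.0, -5.0, 7.0], [0.0, 1.0, 5.0]]

# Does the output for the first (unchanged) sample stay the same?
ln_unchanged = layer_norm(batch_a)[0] == layer_norm(batch_b)[0]
bn_unchanged = batch_norm(batch_a)[0] == batch_norm(batch_b)[0]
print("LayerNorm output unchanged:", ln_unchanged)
print("BatchNorm output unchanged:", bn_unchanged)
```

With LayerNorm the first sample's normalized features are identical in both batches; with BatchNorm they shift because the batch statistics changed, so the gradient with respect to one input is entangled with the rest of the batch.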

3