Quaxi_ t1_j6421fo wrote

And while being easier to train, they also give better results.

Diffusion models are also so much more versatile in their application because of their iterative process.

You can do inpainting or img-to-img, for example, just by conditioning the denoising process in different ways. With a GAN you would typically have to retrain the whole model to achieve that.
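A toy sketch of that conditioning idea (in the spirit of RePaint-style inpainting; the denoiser and schedule here are hypothetical stand-ins, not any real model): at every denoising step, the known pixels are overwritten with a freshly noised copy of the original, so only the masked region comes from the model.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoise_step(x, t):
    # Stand-in for a trained denoiser network (hypothetical):
    # here it just shrinks the sample toward zero each step.
    return 0.9 * x

def inpaint(x_known, mask, steps=50):
    """Toy diffusion inpainting: mask == 1 marks the missing region
    (filled by the model), mask == 0 marks known pixels (kept from
    a noised copy of the original at each step)."""
    x = rng.standard_normal(x_known.shape)  # start from pure noise
    for t in range(steps, 0, -1):
        x = denoise_step(x, t)
        noise_level = t / steps  # crude linear schedule (assumption)
        x_known_noised = x_known + noise_level * rng.standard_normal(x_known.shape)
        # known pixels from the (noised) original, missing ones from the model
        x = mask * x + (1 - mask) * x_known_noised
    return x
```

Note that nothing about the "generator" changes between plain sampling and inpainting; only the per-step conditioning does, which is the versatility point above.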

3

Quaxi_ t1_is7gnsf wrote

No, definitely - GANs can still fail, and they are much less stable than diffusion models. But GANs have enjoyed huge popularity despite that, and research has found ways to mitigate the instability.

I just think it's not the main reason why diffusion models are gaining traction. If it were, we would probably have seen a lot more of variational autoencoders. My work is not at BigGAN or DALLE2 scale though, so I might indeed be missing some scaling aspect of this. :)

2

Quaxi_ t1_is4woj1 wrote

I wouldn't say it is primarily because it is more stable, though; it just gives better results, and the properties of diffusion lead naturally to other applications like in-/outpainting and multimodality.

GANs are quite stable these days. Tricks like feature matching loss, spectral normalization, gradient clipping, TTUR, etc. make mode collapse quite rare.
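To make one of those tricks concrete: spectral normalization divides a layer's weight matrix by an estimate of its largest singular value, keeping the discriminator roughly 1-Lipschitz. A minimal numpy sketch using power iteration (a simplification of how libraries implement it, not any particular framework's API):

```python
import numpy as np

def spectral_normalize(W, n_iters=50, rng=None):
    """Estimate the top singular value of W by power iteration
    and return W scaled so its spectral norm is ~1."""
    rng = rng or np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ W @ v  # estimated largest singular value
    return W / sigma
```

In practice frameworks keep the `u` vector as a persistent buffer and run only one power-iteration step per training update, which is why the trick is nearly free.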

You're correct that it is quite a bit slower at the moment, though. The diffusion process needs many iterative denoising steps per sample, and thus takes longer both to train and to infer.
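The cost difference is easy to see by counting network evaluations: a GAN draws a sample in one generator pass, while a DDPM-style sampler calls the denoiser once per step. A toy sketch with stand-in networks (both are trivial placeholders, not real models):

```python
import numpy as np

calls = {"gan": 0, "diffusion": 0}

def generator(z):
    calls["gan"] += 1        # one network evaluation per sample
    return np.tanh(z)

def denoiser(x, t):
    calls["diffusion"] += 1  # one network evaluation per step
    return 0.9 * x

def gan_sample(z):
    # a GAN sample is a single forward pass
    return generator(z)

def diffusion_sample(shape, steps=1000, rng=None):
    # a diffusion sample runs the denoiser `steps` times
    rng = rng or np.random.default_rng(0)
    x = rng.standard_normal(shape)
    for t in range(steps, 0, -1):
        x = denoiser(x, t)
    return x
```

With the classic 1000-step schedule that is a ~1000x gap in forward passes per sample, which is why fast samplers (DDIM and friends) that cut the step count matter so much.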

11