Submitted by benanne t3_107g3yf in MachineLearning
rodeowrong t1_j3oaq7n wrote
So, is it worth exploring or not? I don't know if I should spend 2 months trying to understand diffusion models only to find they can never be better. VAE-based models had the same fate: I was studying them and suddenly transformers took over.
Ramys t1_j3pcxd1 wrote
VAEs are running under the hood in Stable Diffusion. Instead of denoising a 512x512x3 image directly, the image is encoded with a VAE into a smaller latent space (I think 64x64x4). The denoising steps happen in the latent space, and finally the VAE decodes the result back to color space. This is how it can run relatively quickly and on machines that don't have tons of VRAM.
So it's not necessarily the case that these techniques die. We can learn and incorporate them in larger models.
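To make that flow concrete, here's a rough PyTorch sketch of the idea with toy stand-in modules (the `ToyVAE`, `ToyDenoiser`, and the crude scheduler step are all made up for illustration, not the real Stable Diffusion components):

```python
import torch

# Toy stand-ins for the real pretrained components, just to show the shapes
# and control flow; the actual Stable Diffusion pipeline is far more involved.
class ToyVAE(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = torch.nn.Conv2d(3, 4, kernel_size=8, stride=8)           # 512x512x3 -> 64x64x4
        self.dec = torch.nn.ConvTranspose2d(4, 3, kernel_size=8, stride=8)  # back to pixel space

    def encode(self, x):
        return self.enc(x)

    def decode(self, z):
        return self.dec(z)

class ToyDenoiser(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Conv2d(4, 4, kernel_size=3, padding=1)

    def forward(self, z, t):
        return self.net(z)  # predicts the noise to remove at step t

vae, denoiser = ToyVAE(), ToyDenoiser()

# Sampling starts from noise in the small latent space, not in pixel space.
latents = torch.randn(1, 4, 64, 64)
for t in reversed(range(50)):
    noise_pred = denoiser(latents, t)
    latents = latents - 0.02 * noise_pred   # crude stand-in for a real scheduler step

image = vae.decode(latents)                 # only now do we go back to 512x512x3
print(image.shape)                          # torch.Size([1, 3, 512, 512])
```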
[deleted] t1_j3opz0l wrote
I think it's worth looking at for sure. The math behind it isn't "that" complex, and the idea is pretty intuitive in my opinion. Take it from someone who took months to wrap their head around attention as a concept lol.
thecodethinker t1_j3pichs wrote
Attention is still pretty confusing for me. I find diffusion much more intuitive fwiw.
DigThatData t1_j3v2gjs wrote
Attention is essentially a dynamically weighted dot product: each query is scored against every key, and the resulting weights are used to mix the values. If you haven't already seen this blog post, it's one of the more popular explanations: https://jalammar.github.io/illustrated-transformer/
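A minimal sketch of what that looks like in PyTorch (single head, no masking; the toy shapes are made up for illustration):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # Every query is scored against every key via a dot product.
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    # Softmax turns the scores into dynamic weights that sum to 1 per query.
    weights = F.softmax(scores, dim=-1)
    # The output is a weighted average of the values.
    return weights @ v

q = k = v = torch.randn(5, 16)                # 5 tokens, 16-dim embeddings
out = scaled_dot_product_attention(q, k, v)   # shape (5, 16)
```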
benanne OP t1_j3qy47x wrote
I have an earlier blog post which is intended precisely to build intuition about diffusion :) https://benanne.github.io/2022/01/31/diffusion.html
DigThatData t1_j3v26zy wrote
i think you read that comment backwards :)
gamerx88 t1_j3qft42 wrote
What do you mean, transformers took over? In what area or sense? Do you mean they took over in popularity?