Submitted by SleekEagle t3_ye0iw7 in MachineLearning

Hey everyone!

Over the past couple of years, we've all seen the incredible performance of diffusion models in Stable Diffusion, DALL-E 2, etc. A team at MIT recently published a new model that, much like diffusion models, is inspired by physics.

I wrote an introductory explainer on Poisson Flow Generative Models (PFGMs), which use concepts from electrodynamics to generate data.


Some highlights:

  • 10-20x faster than diffusion models on image generation with comparable performance
  • Invertible mapping akin to normalizing flows allows for explicit likelihood estimation and a well-structured latent space
  • Operates on a deterministic ODE with no stochastic element, unlike diffusion models and score-based SDE approaches

Here are some example CelebA images being generated with PFGMs:

https://i.redd.it/9f826fz726w91.gif

Note that the highest-resolution images explored in the paper are 256 x 256 LSUN bedroom images.

Looking forward to answering any questions!

138

Comments


Serverside t1_itw9tz2 wrote

Ok, I'll bite. It looks cool from what I see in the blog. How does the model being deterministic impact (or not impact) the generative capabilities? I would think that a deterministic mapping from the original data to uniform angles would not perform as well when interpolating or extrapolating (as with VAEs vs. normal autoencoders, for example).

12

SleekEagle OP t1_itwepjh wrote

u/Serverside There are a few things to note here.

First, the non-stochasticity allows for likelihood evaluation. Second, it lets the authors use ODE solvers (RK45 in the case of PFGMs) instead of SDE solvers, which for score-based models are potentially combined with application-specific methods like Langevin MCMC. Further, diffusion requires many discrete time steps to be evaluated in series (at least for the usual discrete-time diffusion models). The result is that PFGMs are faster than these stochastic methods. Lastly, the particular ODE for PFGMs has a weaker norm-time correlation than other ODEs, which in turn makes sampling more robust.
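
Roughly, deterministic sampling then looks something like the sketch below. This is a toy illustration, not the authors' code: `field` is a made-up placeholder for the learned Poisson field network, and the integration bounds and dimensions are invented for the example.

    import numpy as np
    from scipy.integrate import solve_ivp

    # Made-up placeholder for the learned Poisson field network. It returns
    # dx/dz: how the flattened image moves as the augmented z coordinate
    # shrinks. Toy dynamics only, NOT the real field.
    def field(z, x):
        return -x / (z + 1.0)

    # Start from a point on the large-z prior (a uniform hemisphere in PFGMs)
    rng = np.random.default_rng(0)
    x_init = rng.standard_normal(32 * 32 * 3)
    x_init /= np.linalg.norm(x_init)

    # Integrate the deterministic ODE from large z down toward z = 0 with
    # RK45. No noise is injected at any step, unlike an SDE sampler.
    sol = solve_ivp(field, t_span=(40.0, 1e-3), y0=x_init,
                    method="RK45", rtol=1e-4, atol=1e-4)
    x_sample = sol.y[:, -1]  # approximate sample in data space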

As for the deterministic mapping, it is actually the reason (at least in part) that e.g. interpolations work for PFGMs. The mapping is a continuous transformation, sending "near points to near points" by definition. I think the determinism ensures that an interpolated path in the latent space transforms to a well-behaved path in the data space, whereas a stochastic element would very likely break this path. The stochasticity in VAEs is useful for learning the parameters of a distribution and is required to sample from that distribution, but once a point is sampled it is (usually) deterministically mapped back to the data space iirc.
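
To make the interpolation point concrete, here's a minimal sketch of interpolating latents under a deterministic map. Spherical interpolation (slerp) is just a common choice for sphere-like latents, not something prescribed by the paper, and the dimensions are illustrative:

    import numpy as np

    # Spherical interpolation between two latent points a and b
    def slerp(a, b, t):
        a_n = a / np.linalg.norm(a)
        b_n = b / np.linalg.norm(b)
        omega = np.arccos(np.clip(np.dot(a_n, b_n), -1.0, 1.0))
        return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

    rng = np.random.default_rng(0)
    z1, z2 = rng.standard_normal(3072), rng.standard_normal(3072)

    # Each interpolated latent would then be pushed through the same
    # deterministic ODE solve (as in the sketch above); the determinism is
    # what keeps the decoded path in data space smooth.
    frames = [slerp(z1, z2, t) for t in np.linspace(0.0, 1.0, 8)]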

20

Serverside t1_itwm4o3 wrote

I see. Thanks for the in-depth response; your answers make sense. The last follow-up question I have is: do PFGMs preserve the distribution of the data, or, since the data is transformed to a uniform distribution, is the original distribution lost?

I know the other stochastic generative models usually try to match or preserve the distribution of data. Maybe you also somewhat answered this already in your second paragraph, but I just wanted to make sure I understood.

Again, your blog and code look neat. I look forward to toying with them on some data of my own.

7

SleekEagle OP t1_itxpvth wrote

My pleasure! I'm not sure I understand exactly what you're asking; could you try to rephrase it? In particular, I'm not sure what you mean by preservation of the data distribution.

Maybe this will help answer: Given an exact Poisson field generated by a continuous data distribution, PFGMs provide an exact deterministic mapping to/from a uniform hemisphere. While we do not know this Poisson field exactly, we can estimate it given many data points sampled from the distribution. PFGMs therefore provide a deterministic mapping between the uniform hemisphere and the distribution that corresponds to the learned empirical field, but not exactly to the data distribution itself. Given a lot of data, though, we expect this approximate distribution to be very close to the true distribution (this is universal to all generative models).
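
For intuition, here's a minimal sketch of drawing from a uniform-hemisphere prior of the kind the mapping targets (the dimensions and radius are illustrative, not the paper's choices):

    import numpy as np

    # Draw n points uniformly from the upper half of a sphere in `dim`
    # dimensions: normalizing a Gaussian gives a uniform point on the full
    # sphere, and reflecting the last coordinate to be non-negative
    # restricts it to the hemisphere without breaking uniformity.
    def sample_uniform_hemisphere(n, dim, radius=1.0):
        v = np.random.default_rng(0).standard_normal((n, dim))
        v /= np.linalg.norm(v, axis=1, keepdims=True)
        v[:, -1] = np.abs(v[:, -1])
        return radius * v

    # +1 for the augmented z coordinate that PFGMs add to the data dimensions
    prior_points = sample_uniform_hemisphere(4, 32 * 32 * 3 + 1)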

Thanks for reading! I did have a little trouble getting the repo working on my local machine, so you may run into some trouble as well. I reached out to the authors while writing this article, and I believe they are planning to continue research into PFGMs, so keep an eye out for future developments!

10

Serverside t1_itxsq3p wrote

Yeah, you essentially answered what I was asking. I was basically asking whether the output of a trained PFGM matches (or closely estimates) the empirical distribution of the training data. Since the end product of the “diffusion” was said to be a uniform distribution and the equations were ODEs, not SDEs, I was having trouble wrapping my head around how the PFGM could be empirically matching the distribution. Thanks for answering all the questions!

5

andrew21w t1_itxt7q6 wrote

This looks neat as hell. Is there more literature where I can learn about this?

2

ThatInternetGuy t1_itxvv27 wrote

I'm not surprised at all. Simulated annealing is an important optimization technique that is inspired by the metal annealing process.

4

Ulfgardleo t1_ityn2zg wrote

I am not sure I buy the 10-20x faster claim as stated in this post (though I have not had time to read the link yet). There are variants of diffusion models that claim to be 10-40x faster than standard diffusion models.

5

SleekEagle OP t1_itzk3mv wrote

That claim is pulled directly from the paper. While I have not verified it myself, I would not be surprised if it were true relative to discrete-time diffusion models. I'm not very familiar with continuous-time diffusion models, however, so the gap may be a lot smaller in that case!

1

SleekEagle OP t1_itzkagl wrote

This is the first paper on this approach! I spoke to the authors, and they're planning on continuing research down this avenue (personally, I think dropping a PFGM in as the base generator for Imagen and keeping diffusion models for the super-resolution chain would be very cool), so be on the lookout for more papers!

3

Ulfgardleo t1_itzmz4u wrote

My comment was just a remark that "diffusion model" by itself is not informative, because several approaches have already brought large speed-ups. The standard diffusion model is no longer state-of-the-art.

1

andrew21w t1_itzrk2g wrote

Honestly, if one could deal with that, in theory you could do the same with latent codes, essentially making a better GAN or whatnot (assuming I understood how these models work, of course).

2