
SleekEagle OP t1_itwepjh wrote

u/Serverside There are a few things to note here.

First, the non-stochasticity allows for likelihood evaluation. Second, it lets the authors use ODE solvers (RK45 in the case of PFGMs) instead of the SDE solvers used by score-based models, which are potentially combined with an application-specific method like Langevin MCMC. Further, diffusion requires many discrete time steps to be evaluated in series (at least for the usual discrete-time diffusion models). The result is that PFGMs are faster than these stochastic methods. Lastly, the particular ODE for PFGMs has a weaker norm-time correlation than other ODEs, which in turn makes sampling more robust.
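To make the ODE-solver point concrete, here's a minimal sketch of deterministic sampling with an adaptive RK45 solver. The `poisson_field` function here is a toy stand-in I made up (a real PFGM would evaluate a trained network), and the time span is arbitrary; the point is just that sampling reduces to one call to an off-the-shelf ODE integrator, with no noise injection per step:

```python
import numpy as np
from scipy.integrate import solve_ivp

def poisson_field(t, x):
    # Hypothetical stand-in for a learned Poisson field; a real PFGM
    # would evaluate a trained network here. This toy field simply
    # pulls points toward the origin at unit speed.
    return -x / (np.linalg.norm(x) + 1e-8)

# Start from a point drawn from the prior (far from the data) and
# integrate the ODE with adaptive RK45 -- no stochastic term, so a
# single deterministic trajectory is the whole sampling procedure.
x0 = np.array([10.0, 10.0])
sol = solve_ivp(poisson_field, t_span=(0.0, 9.0), y0=x0, method="RK45")
x_final = sol.y[:, -1]
```

Because RK45 adapts its step size, it can take far fewer function evaluations than the fixed grid of discrete-time diffusion sampling.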

As for the deterministic mapping, it is actually the reason (at least in part) that e.g. interpolations work for PFGMs. The mapping is a continuous transformation, mapping "near points to near points" by definition. I think the determinism ensures that interpolated paths in the latent space transform to well-behaved paths in the data space, whereas a stochastic element would very likely break this path. The stochasticity in VAEs is useful for learning the parameters of a distribution and is required for sampling from it, but once a point is sampled it is (usually) deterministically mapped back to the data space, iirc.
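The interpolation argument can be sketched in a few lines. The `decode` function below is a hypothetical deterministic decoder I'm using as a stand-in for the PFGM's ODE-based latent-to-data mapping; any continuous deterministic map has the same property, which is the whole point:

```python
import numpy as np

def decode(z):
    # Hypothetical deterministic, continuous decoder standing in for
    # the PFGM's ODE-based latent-to-data mapping.
    return np.tanh(z) * 2.0

z_a = np.array([-3.0, 1.0])
z_b = np.array([2.0, -1.5])

# Linearly interpolate in latent space; because the mapping is
# deterministic and continuous, the decoded path varies smoothly
# between the two endpoints rather than jumping around.
alphas = np.linspace(0.0, 1.0, 11)
path = np.stack([decode((1 - a) * z_a + a * z_b) for a in alphas])
```

If `decode` injected fresh noise on every call, consecutive points on the path could land far apart and the interpolation would no longer be well-behaved.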

20

Serverside t1_itwm4o3 wrote

I see. Thanks for the in-depth response; your answers make sense. The last follow-up question I have is: do PFGMs preserve the distribution of the data, or since it is transformed to a uniform distribution, is the original distribution of the data lost?

I know the other stochastic generative models usually try to match or preserve the distribution of data. Maybe you also somewhat answered this already in your second paragraph, but I just wanted to make sure I understood.

Again, your blog and code look neat. I look forward to toying with them on some data of my own.

7

SleekEagle OP t1_itxpvth wrote

My pleasure! I'm not sure I understand exactly what you're asking, could you try to rephrase it? In particular, I'm not sure what you mean by preservation of the data distribution.

Maybe this will help answer: given an exact Poisson field generated by a continuous data distribution, PFGMs provide an exact deterministic mapping to/from a uniform hemisphere. While we do not know this Poisson field exactly, we can estimate it given many data points sampled from the distribution. PFGMs therefore provide a deterministic mapping between the uniform hemisphere and the distribution that corresponds to the learned empirical field, but not exactly to the data distribution itself. Given a lot of data, though, we expect this approximate distribution to be very close to the true distribution (this is universal to all generative models).
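A rough sketch of what "estimating the field from data points" means, under the assumption (as in electrostatics) that each training sample acts as a point charge whose field falls off like 1/r^(D-1) in D dimensions, so the field direction scales as (x - y)/||x - y||^D up to a constant; `empirical_field` is my own illustrative name, not from the PFGM codebase:

```python
import numpy as np

def empirical_field(x, data, dim):
    # Empirical Poisson field: superpose the fields of point "charges"
    # placed at the training samples. Each charge at y contributes a
    # term proportional to (x - y) / ||x - y||**dim; averaging over
    # samples approximates the field of the data distribution.
    diffs = x - data                                  # (n_samples, dim)
    norms = np.linalg.norm(diffs, axis=1, keepdims=True)
    return np.mean(diffs / norms**dim, axis=0)

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2))       # toy 2-D "dataset" near the origin
e = empirical_field(np.array([5.0, 0.0]), data, dim=2)
```

The actual method trains a network to regress this field (in an augmented space), but the empirical superposition above is the quantity being estimated.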

Thanks for reading! I did have a little trouble getting the repo working on my local machine, so I'd expect some hiccups if I were you. I reached out to the authors while writing this article and I believe they are planning on continuing research into PFGMs, so keep an eye out for future developments!

10

Serverside t1_itxsq3p wrote

Yeah, you essentially answered what I was asking. I was basically asking whether the output of a trained PFGM matches (or closely estimates) the empirical distribution of the training data. Since the end product of the "diffusion" was said to be a uniform distribution and the equations were ODEs, not SDEs, I was having trouble wrapping my head around how the PFGM could be empirically matching the distribution. Thanks for answering all the questions!

5

SleekEagle OP t1_itzjvvo wrote

Got it! Yeah, the ultimate crux of it is the proof that any continuous, compactly supported distribution generates a field that approaches a uniform flux density at infinity.
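You can see a numerical hint of that result with the same point-charge picture: far from the data, the empirical field direction becomes essentially radial, no matter how anisotropic the data are. This is only an illustrative sketch (the helper names are mine, and the full proof concerns flux density, not just direction):

```python
import numpy as np

rng = np.random.default_rng(1)
# Deliberately anisotropic toy data, stretched along the first axis.
data = rng.normal(size=(1000, 2)) * np.array([3.0, 0.5])

def field_direction(x, data, dim=2):
    # Unit direction of the empirical point-charge field at x.
    diffs = x - data
    norms = np.linalg.norm(diffs, axis=1, keepdims=True)
    e = np.mean(diffs / norms**dim, axis=0)
    return e / np.linalg.norm(e)

# Far from the data, the field direction approaches the purely radial
# direction despite the data's elongated shape.
x_far = np.array([100.0, 100.0])
radial = x_far / np.linalg.norm(x_far)
cos_sim = field_direction(x_far, data) @ radial
```

This near-radial behavior far away is what makes a uniform hemisphere a sensible prior to map to and from.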

4