UncleVesem1r

UncleVesem1r t1_it801hk wrote

Thank you! My intuition was that score matching + Langevin doesn’t have a forward diffusion process, which is probably part of why it needs a step size (right?), and I agree that LD seems like an easy way to use the scores.

How about the SDE formulation of score matching? They also claim that DDPM is a discretization of a variance-preserving SDE. As far as I can tell, the reverse SDE is given in closed form by the forward SDE and doesn’t require extra hyperparameters.
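
For reference, a sketch of the equations in question, written in the notation I recall from the SDE paper (Song et al.). The symbols β(t), β_i, f, and g are that paper's, not anything defined in this thread, so treat the exact form as an assumption to check against the paper:

```latex
% Forward variance-preserving (VP) SDE
dx = -\tfrac{1}{2}\,\beta(t)\,x\,dt + \sqrt{\beta(t)}\,dw

% DDPM forward step, which the VP SDE is the continuous-time limit of
x_i = \sqrt{1-\beta_i}\,x_{i-1} + \sqrt{\beta_i}\,z_{i-1},
\qquad z_{i-1} \sim \mathcal{N}(0, I)

% Reverse-time SDE for a general forward SDE  dx = f(x,t)\,dt + g(t)\,dw
dx = \left[ f(x,t) - g(t)^2\,\nabla_x \log p_t(x) \right] dt + g(t)\,d\bar{w}
```

Note that the reverse SDE is closed-form only given the score ∇_x log p_t(x). Simulating it numerically still means choosing a discretization (a number of steps), which plays a role similar to a step size.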


UncleVesem1r t1_it5rffe wrote

I see! I understand why DDPM is good now. I should go back to the paper and pay more attention to the KL divergence part of it.

If I could borrow a few more minutes of your time, could you explain more about what's not as good about score matching?

So to be explicit, my understanding of Langevin sampling is that if there's a model that accurately models the score function, one should be able to recover the true data distribution. If this is true, then I guess the criticism of SM is about its objective function, i.e., there's no guarantee that it leads to an accurate score function? But aren't the score matching algorithms (denoising, projection) supposed to be able to optimize the objective involving grad_x log p(x)?

Or perhaps Langevin sampling is the problem. The paper does say that with small enough noise and enough steps, we would end up with an exact sample from the data distribution. Yet if the noise isn't small enough or there aren't enough steps, perhaps we end up somewhere, but it isn't guaranteed to be a sample from the true data distribution?
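
A minimal sketch of that last point, assuming a toy 1-D target N(0, 1) whose score is known exactly (so score-matching error plays no role) and the update x <- x + (eps/2) * score(x) + sqrt(eps) * z. The function names are made up for illustration. With a small step the samples should have variance close to 1; with a large step the chain still settles down, but to a distribution with the wrong variance.

```python
import numpy as np


def score_standard_normal(x):
    # Exact score of N(0, 1): d/dx log p(x) = -x.
    return -x


def langevin_sample(score_fn, step_size, n_steps, n_chains=20000, seed=0):
    # Run n_chains independent Langevin chains in parallel (hypothetical helper).
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 3.0, size=n_chains)  # arbitrary wide initialization
    for _ in range(n_steps):
        noise = rng.standard_normal(n_chains)
        x = x + 0.5 * step_size * score_fn(x) + np.sqrt(step_size) * noise
    return x


def stationary_variance(step_size):
    # For this linear score the chain is AR(1), so its stationary variance
    # can be solved by hand: v = 1 / (1 - step_size / 4), for 0 < step_size < 4.
    return 1.0 / (1.0 - step_size / 4.0)


small = langevin_sample(score_standard_normal, step_size=0.02, n_steps=2000)
large = langevin_sample(score_standard_normal, step_size=2.0, n_steps=200)

print("small step:", small.var(), "predicted:", stationary_variance(0.02))
print("large step:", large.var(), "predicted:", stationary_variance(2.0))
```

For this target the bias works out to 1 / (1 - eps/4), so eps = 2 gives a stationary variance of 2 rather than 1, even though the score is perfect.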

I really appreciate this! Thanks again.


UncleVesem1r t1_it5bxzm wrote

Thank you for the reply. It was very helpful.

>Score matching on the other hand, has no theoretical guarantee that it will produce something accurate enough to be used for Langevin sampling.

Sorry if I'm being dense. Could you expand on this? Or could you be more explicit about which part of DDPM provides such a theoretical guarantee while SBM fails to do so, perhaps with equation numbers from the papers? I'm fairly new to this, and it's hard for me to parse all the equations and understand which part is fluff and which is the real meat. Thank you very much!
