
UncleVesem1r t1_it801hk wrote

Thank you! My intuition was that score matching + Langevin doesn't have a forward diffusion process, which is probably part of why a step size has to be chosen by hand (right?), and I agree that LD seemed like an easy way to make use of the scores.
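For context, here's a minimal sketch of annealed Langevin dynamics in the Song & Ermon style, assuming a trained score model `score(x, sigma)` (approximating grad_x log p_sigma(x)) and a decreasing list of noise scales `sigmas`; the step size `eps` and its per-scale rescaling are exactly the hand-tuned hyperparameters in question:

```python
import torch

def annealed_langevin_sample(score, sigmas, shape, n_steps=100, eps=2e-5):
    """Annealed Langevin dynamics sampler (sketch, not the canonical implementation).

    `score(x, sigma)` is assumed to approximate grad_x log p_sigma(x).
    The base step size `eps` and the per-scale rescaling below are
    heuristic choices, unlike the analytically derived DDPM reverse step.
    """
    x = torch.randn(shape)  # start from an uninformative prior
    for sigma in sigmas:    # anneal from largest to smallest noise scale
        # common heuristic: scale the step size with the current noise level
        step = eps * (sigma / sigmas[-1]) ** 2
        for _ in range(n_steps):
            z = torch.randn_like(x)
            x = x + 0.5 * step * score(x, sigma) + (step ** 0.5) * z
    return x
```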

How about the SDE formulation of score matching? They also claimed that DDPM is a discretization of a variance-preserving SDE. As far as I can tell, the reverse SDE follows in closed form from the forward SDE and doesn't require extra hyperparameters.
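For concreteness, the closed-form reversal being referred to (Anderson's time-reversal result, as used in the Song et al. SDE paper) can be written as below; nothing new appears beyond the forward drift f, the diffusion g, and the learned score:

```latex
\text{Forward SDE:}\quad dx = f(x, t)\,dt + g(t)\,dw \\
\text{Reverse SDE:}\quad dx = \bigl[f(x, t) - g(t)^2\,\nabla_x \log p_t(x)\bigr]\,dt + g(t)\,d\bar{w}
```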


dasayan05 t1_it95xq7 wrote

IMO, the forward diffusion process isn't really a "process" -- it need not be sequential; it's parallelizable. The sole purpose of the forward process is to simulate noisy data from a set of "noisy data distributions" crafted with a known set of noise scales -- that's it. SBMs and DDPM both have this. For SBMs, choosing the largest noise scale is again a heuristic hyperparameter: it has to overpower the data variance so that the process reaches an uninformative prior. For DDPM, the prior is always reached because the noise scales and attenuation coefficients are computed from \beta_t.
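To illustrate the "not really a process" point: in DDPM the marginal q(x_t | x_0) is Gaussian in closed form, so a batch can be noised at arbitrary timesteps in one shot. A sketch, assuming `alphas_cumprod` has been precomputed from the \beta_t schedule:

```python
import torch

def noise_at_timestep(x0, t, alphas_cumprod):
    """Sample x_t ~ q(x_t | x_0) directly, with no sequential simulation.

    alphas_cumprod[t] = prod_{s<=t} (1 - beta_s); because the attenuation
    and noise scales are derived from beta_t, x_T ends up (near-)standard normal.
    """
    a_bar = alphas_cumprod[t].view(-1, *([1] * (x0.dim() - 1)))
    noise = torch.randn_like(x0)
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise, noise

# usage sketch: every element of the batch gets its own random timestep
# betas = torch.linspace(1e-4, 0.02, 1000)
# alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
# t = torch.randint(0, 1000, (x0.shape[0],))
# x_t, eps = noise_at_timestep(x0, t, alphas_cumprod)
```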

Agree with your second part. The SDE formulation is good -- it basically puts SBMs into a stronger theoretical framework. SDEs offer a reverse process that is analytic and in which the score appears naturally -- i.e. again not much in the way of hyperparameters.
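A rough Euler-Maruyama sketch of that analytic reverse process for the VP-SDE case, assuming a trained score model `score(x, t)` and a `beta(t)` schedule function; the only remaining choice is the discretization grid, not a new sampler hyperparameter:

```python
import torch

def reverse_vp_sde_sample(score, beta, shape, n_steps=1000):
    """Euler-Maruyama on the reverse-time VP-SDE (sketch).

    Forward VP-SDE:  dx = -0.5 * beta(t) * x dt + sqrt(beta(t)) dw
    Reverse SDE:     dx = [-0.5*beta(t)*x - beta(t)*score(x, t)] dt + sqrt(beta(t)) dw_bar
    """
    x = torch.randn(shape)      # prior sample at t = 1
    dt = -1.0 / n_steps         # integrate backwards from t = 1 to t = 0
    for i in range(n_steps):
        t = 1.0 - i / n_steps
        drift = -0.5 * beta(t) * x - beta(t) * score(x, t)
        diffusion = beta(t) ** 0.5
        z = torch.randn_like(x)
        x = x + drift * dt + diffusion * abs(dt) ** 0.5 * z
    return x
```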
