Submitted by Individual-Cause-616 t3_10m4l0b in MachineLearning

I know there is a mathematical way to show that the two approaches, score matching models and diffusion models, are the same. I wonder whether they are also the same in practice/code. I already tried to find some PyTorch implementations of score-based models but haven't found anything yet, just ones for diffusion models.

7

Comments

royalemate357 t1_j60xuup wrote

there's an implementation of score-based models from the paper that showed how score based models and diffusion models are the same here: https://github.com/yang-song/score_sde_pytorch

imo their implementation is more or less the same as a diffusion model, except score based models would use a numerical ODE/SDE solver to generate samples instead of using the DDPM based sampling method. it might also train on continuous time, so rather than choosing t ~ randint(0, 1000) it would be t ~ rand_uniform(0, 1.)
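To make the time-sampling difference concrete, here is a minimal sketch in plain Python (no PyTorch, so it runs anywhere). The names `NUM_STEPS`, `EPS`, `sample_t_discrete` and `sample_t_continuous` are made up for illustration and are not taken from the linked repository; the small lower bound `EPS` is an assumption, commonly used to keep t away from exactly 0.

```python
import random

NUM_STEPS = 1000  # hypothetical number of discrete diffusion steps
EPS = 1e-5        # hypothetical lower bound that keeps t away from exactly 0


def sample_t_discrete(rng, num_steps=NUM_STEPS):
    """DDPM-style: an integer timestep index in {0, ..., num_steps - 1}."""
    return rng.randrange(num_steps)


def sample_t_continuous(rng, eps=EPS, t_max=1.0):
    """Continuous-time style: a real-valued time in [eps, t_max)."""
    return eps + (t_max - eps) * rng.random()


rng = random.Random(0)  # seeded only so the sketch is reproducible
t_discrete = [sample_t_discrete(rng) for _ in range(5)]
t_continuous = [sample_t_continuous(rng) for _ in range(5)]
print(t_discrete)    # five integer step indices
print(t_continuous)  # five floats between EPS and 1
```

Everything else in the training step (add noise at level t, predict, take a squared-error loss) can stay structurally the same; only the way t is drawn and fed to the network changes.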

7

Individual-Cause-616 OP t1_j60yi8b wrote

So do you think it makes a difference in practice, i.e. sampling speed and quality, convergence, etc.?

4

royalemate357 t1_j61k9vy wrote

the speed and quality of score-based/diffusion models depend on which sampler you use. If you're using Euler's method to solve the ODE, for example, that might be slower than some of the newer methods developed for diffusion models, like Tero Karras' ODE solvers. AFAIK there isn't consensus on what the best sampler is, though.

i don't think it affects training convergence much though, since it's more or less the same objective.
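A toy sketch of the Euler-sampler point, under loud assumptions: the data is a 1-D Gaussian N(0, S0^2), the noise schedule is sigma(t) = t (so the noised marginal is N(0, S0^2 + t^2) and the score is known in closed form instead of coming from a trained network), and the probability flow ODE then reads dx/dt = -t * score(x, t). All names here (`S0`, `T_MAX`, `score`, `euler_sample`, `exact_sample`) are hypothetical. In this special case the ODE has an exact solution, so the Euler error can be measured directly.

```python
import math

S0 = 1.0      # std of the toy 1-D Gaussian data distribution (assumption)
T_MAX = 10.0  # largest noise level, where sampling starts (assumption)


def score(x, t, s0=S0):
    """Exact score of N(0, s0^2 + t^2), the noised marginal at time t."""
    return -x / (s0 ** 2 + t ** 2)


def euler_sample(x_start, num_steps, t_max=T_MAX):
    """Integrate dx/dt = -t * score(x, t) from t_max down to 0 with Euler steps."""
    dt = -t_max / num_steps  # negative: we move backward in time
    x = x_start
    for i in range(num_steps):
        t = t_max + i * dt  # current time: t_max, t_max - h, ..., h
        x = x + dt * (-t * score(x, t))
    return x


def exact_sample(x_start, s0=S0, t_max=T_MAX):
    """Closed-form ODE solution at t = 0 for this Gaussian toy case."""
    return x_start * s0 / math.sqrt(s0 ** 2 + t_max ** 2)


x_start = 5.0  # an arbitrary starting point at t = T_MAX
errors = {n: abs(euler_sample(x_start, n) - exact_sample(x_start))
          for n in (10, 100, 1000)}
for n, err in errors.items():
    print(f"{n:5d} Euler steps -> abs error {err:.5f}")
```

The error shrinks as the step count grows, which is the trade-off in question: plain Euler needs many network evaluations for a given accuracy, and higher-order or better-spaced solvers aim to reach similar accuracy in fewer steps.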

4

jimmymvp t1_j67uru2 wrote

Diffusion models are effectively score-based: the reversal of the forward process is Gaussian and is connected to the noise estimate, so in the reverse process you're effectively using scores of Gaussians. The scale of the time variable is irrelevant; discrete time and continuous time do roughly the same thing. The difference is that one is tied to a specific discretization of the SDE while the other can be solved to arbitrary precision. It also makes a difference whether you take steps wrt variance or wrt time. Essentially, the continuous formulation should be the limit of the discrete one, so you can take a discrete sampling method and turn it into a continuous SDE/ODE.
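The link between the noise estimate and the score can be checked numerically. The sketch below assumes the usual DDPM forward process x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps, for which the score of the Gaussian q(x_t | x_0) equals -eps / sqrt(1 - alpha_bar). The helper name `eps_to_score` and the concrete numbers are made up for illustration, and the true noise stands in for a network's noise prediction.

```python
import math
import random


def eps_to_score(eps, alpha_bar):
    """Convert a noise value into the score of q(x_t | x_0) at that point."""
    return -eps / math.sqrt(1.0 - alpha_bar)


rng = random.Random(0)
x0 = 0.7         # an arbitrary clean data point (illustrative)
alpha_bar = 0.3  # an arbitrary cumulative signal level (illustrative)
eps = rng.gauss(0.0, 1.0)

# Forward process: noise the clean sample.
x_t = math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * eps

# Score of N(sqrt(alpha_bar) * x0, 1 - alpha_bar) evaluated at x_t.
score_direct = -(x_t - math.sqrt(alpha_bar) * x0) / (1.0 - alpha_bar)

# Same quantity, obtained from the noise alone.
score_from_eps = eps_to_score(eps, alpha_bar)

print(score_direct, score_from_eps)  # the two agree up to rounding
```

So a network trained to predict eps gives a score estimate after a rescaling by the noise level, which is why the two kinds of model end up so similar in code.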

2