Submitted by wellfriedbeans t3_10r6qn0 in MachineLearning
jimmymvp t1_j71bvhf wrote
Reply to comment by based_goats in [D] Normalizing Flows in 2023? by wellfriedbeans
The problem with diffusion from an SDE view is that you still don't have exact likelihoods: you're again not computing the exact Jacobian (that would be intractable), and you accumulate ODE-solver error on top. People mostly resort to the Hutchinson trace estimator, since computing the trace exactly would be too expensive, so I don't think diffusion in this form is going to enter the MCMC world anytime soon.
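(To illustrate the Hutchinson estimator mentioned above: it replaces the exact trace of the Jacobian with a Monte Carlo average of probe-vector products, needing only Jacobian-vector products rather than the full Jacobian. A minimal sketch, my own illustration rather than anything from the thread, using a known linear map as a sanity check:)

```python
import numpy as np

def hutchinson_trace(jvp, dim, n_samples=100, rng=None):
    """Estimate tr(J) as E[v^T J v] over Rademacher probe vectors v.

    jvp: function v -> J @ v (Jacobian-vector product); the full
    Jacobian is never formed, which is the whole point in high dims.
    The estimator is unbiased since E[v v^T] = I for Rademacher v.
    """
    rng = np.random.default_rng(rng)
    est = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=dim)
        est += v @ jvp(v)
    return est / n_samples

# Sanity check on a linear map f(x) = A x, so J = A and tr(J) = 5.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
approx = hutchinson_trace(lambda v: A @ v, dim=2, n_samples=5000, rng=0)
```

(The trade-off the comment refers to: the estimate is cheap but stochastic, so the resulting log-likelihood is itself only an unbiased estimate, not an exact value.)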
based_goats t1_j72jd9z wrote
There are some papers showing diffusion working better than flows for high-dimensional data in likelihood-free inference, even when just using an ELBO bound. Can dig them up later if wanted.
jimmymvp t1_j75qyff wrote
Would be interested in that yes
based_goats t1_j77cejz wrote
Here's one using GANs, so not using an explicit likelihood: https://arxiv.org/abs/2203.06481
Here's a workshop paper applying score-based models: https://arxiv.org/abs/2209.14249
badabummbadabing t1_j76tfqt wrote
Fully agree with you from a technical perspective.
The difference is that at best, you only get the likelihood under your model of choice. If that happens to be a bad model of reality (which I'd argue is the case more often than not with NFs), you might be better off just using some approximate likelihood (or ELBO) of a more powerful model.
But I am not an expert in MCMC models, so I might be talking out of my depth here. I was mainly using these models for MAP estimation.
jimmymvp t1_j7aend6 wrote
Indeed, if your model is bad at modeling the data, there's not much use in computing likelihoods. And if you just want to sample images that look cool, you don't care much about likelihoods either. However, there are use cases where we do care about exact likelihoods: estimating normalizing constants and providing guarantees for MCMC. Granted, you can always run MCMC with something close to a proposal distribution, but then obtaining nice guarantees on convergence and mixing times (correctness?) is difficult; I don't know how you're supposed to do this with a proposal whose likelihood you can't evaluate. Similarly with importance sampling: you only obtain correct weights if you have the exact likelihoods; otherwise the error is not just in the model but also in the estimator.
This is the way I see it at least, but I'll be sure to read the papers mentioned above. I'm also not sure how much having only a lower bound hurts the estimation.
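(The importance-sampling point above can be made concrete with a toy example; this is my own illustration, not from the thread. With exact log-densities for both the target p and the proposal q, the weights p(x)/q(x) give an unbiased estimator; if you can only bound log q, the weights, and hence the estimate, pick up a bias you can't quantify. Here both densities are Gaussians chosen for the sake of the example:)

```python
import numpy as np

rng = np.random.default_rng(0)

# Target p = N(0, 1); proposal q = N(0.5, 1.5^2). We estimate
# E_p[x^2] = 1 using only samples from q and exact log-densities.
def logp(x):
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def logq(x):
    return -0.5 * ((x - 0.5) / 1.5) ** 2 - np.log(1.5) - 0.5 * np.log(2 * np.pi)

x = rng.normal(0.5, 1.5, size=200_000)   # draws from the proposal q
w = np.exp(logp(x) - logq(x))            # exact importance weights p/q
estimate = np.mean(w * x**2)             # unbiased estimate of E_p[x^2]
```

(Replace `logq` with an ELBO and `w` is systematically off: the estimator stays consistent only for the biased surrogate, which is exactly the "approximate not just in the model but also in the estimator" issue raised above.)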