Hi all!
I'm a fairly advanced Machine Learner, but I struggle with something that sounds rather easy.

I have a fully trained GAN and want to invert the generator. Details on the GAN below. In short, it's a fairly simple GAN, no stylegan or anything fancy.

So I sample a random latent, pass it through the generator, and get a fake image. I then compute a metric comparing the reference image (the one I want a z for) with the fake image. I backprop this metric value to get a gradient on the latent, which I then update with an optimizer. Sounds easy enough, and "my code works" ™.
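
A minimal sketch of the loop just described, under heavy assumptions: `toy_generator` is a hypothetical stand-in for the trained generator, the metric is plain L2, and the gradient comes from finite differences so the snippet runs without a deep learning framework. In a real setup autograd would supply the gradient and the chosen optimizer would apply the update.

```python
import math
import random

def toy_generator(z):
    """Hypothetical stand-in for a trained generator: 2-D latent -> 3-pixel 'image'."""
    return [
        math.tanh(z[0] + 0.5 * z[1]),
        math.tanh(z[0] - z[1]),
        math.tanh(0.3 * z[0] * z[1]),
    ]

def l2_loss(z, target):
    """Squared L2 distance between the generated and the reference 'image'."""
    return sum((g - t) ** 2 for g, t in zip(toy_generator(z), target))

def numeric_grad(f, z, h=1e-5):
    """Central finite differences; a stand-in for backprop in this toy example."""
    grad = []
    for i in range(len(z)):
        z_plus = list(z)
        z_minus = list(z)
        z_plus[i] += h
        z_minus[i] -= h
        grad.append((f(z_plus) - f(z_minus)) / (2 * h))
    return grad

def invert(target, z0, lr=0.05, steps=500):
    """Plain gradient descent on the latent; returns the final latent and the loss history."""
    z = list(z0)
    history = []
    for _ in range(steps):
        history.append(l2_loss(z, target))
        grad = numeric_grad(lambda v: l2_loss(v, target), z)
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    history.append(l2_loss(z, target))
    return z, history

rng = random.Random(0)
target = toy_generator([0.7, -0.4])          # reference image with a known latent
z0 = [rng.gauss(0, 1), rng.gauss(0, 1)]      # random initial latent
z_hat, history = invert(target, z0)
print(f"loss: {history[0]:.4f} -> {history[-1]:.6f}")
```

Logging `history` like this makes it easy to see whether the loss is still falling or has stalled.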

The problem is that no matter which of the following combinations of metric and optimizer I try, the fake samples do not converge to anything near the reference image. Yes, the fake image changes a little bit from the initial one, but the optimization comes to a grinding halt fairly quickly.

For metrics I tried L1 and L2 distance as well as LPIPS with VGG as the network. For optimizers I tried SGD, SGD with Momentum and Adam, also playing around with the parameters a bit.

One more thing I tried was generating 1000 random latents and selecting the one that minimizes the metric as the starting point, to rule out a bad initial latent as the reason the method fails.

I then looked into the research and found this survey on GAN inversion, where Table 1 points me to this work by Creswell et al., who use a different metric / error (see their Algorithm 1). But when I try to implement that, the value quickly gets NaN (even though I add a small epsilon inside the log terms).
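
One possible cause, stated as an assumption rather than a diagnosis: if the error contains terms of the form log(G(z)) and log(1 - G(z)), both arguments must lie in (0, 1). A generator whose outputs can fall outside [0, 1] (for example one trained on images scaled to [-1, 1]) can push an argument below zero, and a small epsilon does not rescue that; in PyTorch, `torch.log` of a negative number is NaN. The sketch below rescales and clamps before taking the log. The function names and the [-1, 1] range are hypothetical.

```python
import math

def to_unit_range(v, lo=-1.0, hi=1.0):
    """Rescale a pixel from the assumed generator range [lo, hi] to [0, 1]."""
    return (v - lo) / (hi - lo)

def safe_bce(reference, generated, eps=1e-6):
    """Mean cross-entropy between two pixel lists that are expected to lie in [0, 1].

    The generated pixel is clamped into [eps, 1 - eps] so that neither
    log(g) nor log(1 - g) ever sees a non-positive argument.
    """
    total = 0.0
    for x, g in zip(reference, generated):
        g = min(max(g, eps), 1.0 - eps)
        total += -(x * math.log(g) + (1.0 - x) * math.log(1.0 - g))
    return total / len(reference)

reference = [-1.0, -0.2, 0.4, 1.0]
generated = [-1.3, 0.1, 0.5, 1.2]   # deliberately outside [-1, 1] in places
loss = safe_bce(
    [to_unit_range(v) for v in reference],
    [to_unit_range(v) for v in generated],
)
print(loss)
```

Clamping has a cost: in an autograd framework, pixels that hit the clamp contribute zero gradient, so it is worth checking how many pixels are affected.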

I am at a bit of a loss here. What is the standard way of doing this? I feel like I overlook something obvious. Any hints/links/papers greatly appreciated!

GAN details: I trained using the code from https://github.com/lucidrains/lightweight-gan, image size is 256, attn-res-layers is [32,64], disc_output_size is 5 and I trained with AMP.

Comments

CatalyzeX_code_bot t1_iwwvzlz wrote on November 18, 2022 at 11:31 PM

#572,940

Found relevant code at https://github.com/zhoubolei/awesome-generative-modeling + all code implementations here

To opt out from receiving code links, DM me

autoencoder t1_iwwyabp wrote on November 18, 2022 at 11:49 PM

#573,112

> the value quickly gets NaN

Sounds like numerical issues. That could be caused by a learning rate that is too high.

What does your training error look like across iterations? If it jumps all over the place (increasing a lot maybe), then it's too high, as the step sizes overshoot their targets repeatedly.
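
A rough sketch of that check, assuming the loss per iteration is collected in a list. `classify_loss_curve` and its 30% threshold are hypothetical, and the toy loss f(z) = z**2 only stands in for the real inversion objective.

```python
import math

def classify_loss_curve(history, max_increase_frac=0.3):
    """Heuristic: label a loss curve as 'non-finite', 'unstable' or 'ok'.

    The threshold is an arbitrary assumption, not an established rule.
    """
    if any(not math.isfinite(v) for v in history):
        return "non-finite"
    increases = sum(1 for a, b in zip(history, history[1:]) if b > a)
    if increases / max(len(history) - 1, 1) > max_increase_frac:
        return "unstable"
    return "ok"

def run_gd(lr, steps=50, z0=1.0):
    """Gradient descent on the toy loss f(z) = z**2, whose gradient is 2*z."""
    z, history = z0, []
    for _ in range(steps):
        history.append(z * z)
        z -= lr * 2.0 * z
    return history

small_lr = classify_loss_curve(run_gd(lr=0.1))   # each step shrinks z
large_lr = classify_loss_curve(run_gd(lr=1.1))   # each step overshoots and grows |z|
print(small_lr, large_lr)
```

If the curve comes back as unstable, lowering the learning rate by a factor of 10 and rerunning is a cheap first test.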

UltimateGPower t1_iwx4cie wrote on November 19, 2022 at 12:37 AM

#573,513

What about a VAE-GAN? It couples a VAE (encoder-decoder) network with a GAN by sharing weights between the generator and decoder. This way you can use the encoder to obtain the latent variable of interest.

[deleted] t1_iwxms26 wrote on November 19, 2022 at 3:13 AM

#574,774

[deleted]

IWantAGrapeInMyMouth t1_iwxqrxp wrote on November 19, 2022 at 3:49 AM

#575,069

Replying to autoencoder (#573,112)

Every time I've gotten NaN values it was due to a high learning rate, so I second this.

Bitter_Campaign706 t1_iwxr165 wrote on November 19, 2022 at 3:51 AM

#575,088

Hey, I do research in GAN inversion; you can direct message me with any questions. The keywords you are looking for are GAN priors.

Bitter_Campaign706 t1_iwy0m3s wrote on November 19, 2022 at 5:28 AM

#575,746

See the GAN prior section of the code at https://github.com/CACTuS-AI/GlowIP

pilooch t1_iwybbh6 wrote on November 19, 2022 at 7:44 AM

#576,364

Hey there, this is a truly difficult problem. My colleagues and I train very precise GANs on a daily basis. We gave up on inversion and latent control a couple of years ago, and we actually don't need it anymore.

My raw take on this is that the GAN latent space is too compressed/folded for low-level control. When fine-tuning image-to-image GANs, for instance, we do get a certain fine control of the generator, though we 'see' it snap to one 'mode' or the other. That is, we witness a lack of smoothness that may implicitly prevent granular control.

Haven't looked at the theoretical side of this in a while though, so you may well know better...

bloc97 t1_iwyeh1x wrote on November 19, 2022 at 8:29 AM

#576,532

Replying to pilooch (#576,364)

>the GAN latent space is too compressed/folded

I remember reading a paper that showed that GANs often fold many dimensions of the "internal" latent space into singularities, with large swathes of flat space between them (it's related to the mode collapse problem of GANs).

Back to the question, I guess that when OP is trying to invert the GAN using gradient descent, he is probably getting stuck in a local minimum. Try a global search metaheuristic on top of the gradient descent like simulated annealing or genetic algorithms?
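
A minimal sketch of one such wrapper: random restarts around a local gradient descent, which is about the simplest global-search scheme (simulated annealing or a genetic algorithm would replace the outer loop). `toy_loss` is a hypothetical multimodal stand-in for metric(G(z), reference), not a real generator, and its gradient is written out by hand where a real setup would use backprop.

```python
import math
import random

def toy_loss(z):
    """Hypothetical multimodal loss with many local minima; global minimum at z = 0."""
    return sum(zi * zi + 2.0 * (1.0 - math.cos(3.0 * zi)) for zi in z)

def toy_grad(z):
    """Analytic gradient of toy_loss."""
    return [2.0 * zi + 6.0 * math.sin(3.0 * zi) for zi in z]

def local_descent(z0, lr=0.02, steps=300):
    """Plain gradient descent from one starting latent; returns (latent, loss)."""
    z = list(z0)
    for _ in range(steps):
        z = [zi - lr * gi for zi, gi in zip(z, toy_grad(z))]
    return z, toy_loss(z)

def multi_start(n_starts=50, dim=2, seed=0):
    """Run local descent from several random latents and collect every result."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_starts):
        z0 = [rng.gauss(0.0, 2.0) for _ in range(dim)]
        results.append(local_descent(z0))
    return results

results = multi_start()
best_z, best_loss = min(results, key=lambda r: r[1])
worst_loss = max(r[1] for r in results)
print(f"best {best_loss:.6f}, worst {worst_loss:.6f}")
```

The spread between the best and worst run gives a rough idea of how much the result depends on the starting latent.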

_Ruffy_ OP t1_iwymh68 wrote on November 19, 2022 at 10:29 AM

#576,987

Replying to Bitter_Campaign706 (#575,088)

Thanks for the gracious offer! I will try some reading on GAN priors and if I can't figure it out myself, I am going to reach out :)

_Ruffy_ OP t1_iwymjmq wrote on November 19, 2022 at 10:30 AM

#576,989

Replying to bloc97 (#576,532)

>Try a global search metaheuristic on top of the gradient descent like simulated annealing or genetic algorithms?

Will look into this, thanks!

_Ruffy_ OP t1_iwymlw5 wrote on November 19, 2022 at 10:31 AM

#576,996

Replying to [deleted] (#574,774)

One. So I am passing a single latent through the generator. Shouldn't make a difference though, should it? Generator is in .eval() mode.

_Ruffy_ OP t1_iwymmlv wrote on November 19, 2022 at 10:31 AM

#576,997

Replying to Bitter_Campaign706 (#575,746)

Thanks for the hint! Will check it out.

mythrowaway0852 t1_iwyuqz0 wrote on November 19, 2022 at 12:23 PM

#577,448

This particular GAN is proposed for anomaly detection but still worth looking at as it tackles a similar problem to the one you describe https://arxiv.org/abs/2009.07769

_Ruffy_ OP t1_iwyy8g2 wrote on November 19, 2022 at 1:03 PM

#577,699

Replying to mythrowaway0852 (#577,448)

Thank you!!

wowAmaze t1_iwzvyru wrote on November 19, 2022 at 5:36 PM

#580,404

Replying to bloc97 (#576,532)

Do you remember the name of that paper?

Firehead1971 t1_ix0b7h8 wrote on November 19, 2022 at 7:23 PM

#581,705

Check also for gradient explosion or vanishing gradient which might occur very early in the training!