Submitted by _Ruffy_ t3_yyxsxv in MachineLearning

Hi all!
I'm a fairly experienced machine learning practitioner, but I'm struggling with something that sounds rather easy.

I have a fully trained GAN and want to invert the generator. Details on the GAN are below. In short, it's a fairly simple GAN, no StyleGAN or anything fancy.

So I sample a random latent, pass it through the generator, and get a fake image. I then compute a metric comparing the reference image (the one I want a z for) with the fake image. I backprop this metric value to get a gradient on the latent, which I then update with an optimizer. Sounds easy enough, and "my code works" ™.
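The loop looks roughly like this (a minimal sketch assuming PyTorch; `generator`, `reference`, and `latent_dim` are placeholders for my actual objects):

```python
import torch
import torch.nn.functional as F

# Sketch of the inversion loop: only the latent receives gradients,
# the trained generator stays frozen.
generator.eval()
for p in generator.parameters():
    p.requires_grad_(False)

z = torch.randn(1, latent_dim, requires_grad=True)
optimizer = torch.optim.Adam([z], lr=0.05)

for step in range(1000):
    optimizer.zero_grad()
    fake = generator(z)                      # (1, 3, 256, 256), same range as `reference`
    loss = F.mse_loss(fake, reference)       # L2 metric; L1 or LPIPS can be swapped in here
    loss.backward()
    optimizer.step()
```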

The problem is that no matter which of the following combinations of metric and optimizer I try, the fake samples do not converge to anything near the reference image. Yes, the fake image changes a little bit from the initial one, but the optimization comes to a grinding halt fairly quickly.

For metrics I tried L1 and L2 distance as well as LPIPS with VGG as the network. For optimizers I tried SGD, SGD with momentum, and Adam, playing around with their hyperparameters a bit.

One more thing I tried: I generated 1000 random latents and selected the one that minimizes the metric as the initialization, to rule out the method failing simply because of a bad starting latent.
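Concretely, something like this (same hypothetical names as in the sketch above):

```python
import torch
import torch.nn.functional as F

# Sketch of the best-of-1000 initialization: score random latents with the
# same metric and keep the best one as the starting point.
best_z, best_loss = None, float("inf")
with torch.no_grad():
    for chunk in torch.randn(1000, latent_dim).split(50):         # chunks to fit in memory
        fakes = generator(chunk)
        losses = F.mse_loss(fakes, reference.expand_as(fakes), reduction="none")
        losses = losses.flatten(1).mean(dim=1)                     # per-candidate loss
        i = int(losses.argmin())
        if losses[i] < best_loss:
            best_loss, best_z = float(losses[i]), chunk[i : i + 1].clone()

z = best_z.requires_grad_(True)    # used as the initial latent for the loop above
```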

I then looked into the literature and found this survey on GAN inversion, where Table 1 points me to this work by Creswell et al., who use a different metric/error; see their Algorithm 1. But when I try to implement that, the value quickly gets NaN (even though I add a small epsilon inside the log terms).
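My attempt at that error looks roughly like this (a sketch; I'm assuming Algorithm 1's error is a per-pixel cross-entropy between the reference and the generated image, which only makes sense for pixel values in [0, 1], so a tanh-range generator output would feed negative arguments to the logs and produce NaNs that no epsilon can fix):

```python
import torch

def creswell_style_loss(fake: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Sketch of a per-pixel cross-entropy error. Both tensors are assumed to be
    # in [0, 1]; with a tanh generator, rescale via (x + 1) / 2 first, otherwise
    # the log arguments go negative and the loss becomes NaN.
    fake = fake.clamp(eps, 1.0 - eps)        # keep log arguments strictly inside (0, 1)
    return -(target * torch.log(fake) + (1.0 - target) * torch.log(1.0 - fake)).mean()
```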

I am at a bit of a loss here. What is the standard way of doing this? I feel like I'm overlooking something obvious. Any hints/links/papers greatly appreciated!

GAN details: I trained using the code from https://github.com/lucidrains/lightweight-gan, image size is 256, attn-res-layers is [32,64], disc_output_size is 5 and I trained with AMP.

45

Comments


autoencoder t1_iwwyabp wrote

> the value quickly gets NaN

Sounds like numerical issues. That could be caused by a learning rate that is too high.

What does your error look like across iterations? If it jumps all over the place (maybe increasing a lot), then the learning rate is too high, as the step sizes overshoot their targets repeatedly.
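One quick way to see this (a minimal sketch, reusing the hypothetical `generator`, `reference`, and `latent_dim` from the post): sweep a few learning rates and plot the loss curves; a healthy curve decreases fairly steadily, an overshooting one oscillates or blows up.

```python
import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

# Sketch: run the same inversion loop at a few learning rates and compare curves.
for lr in (1e-1, 1e-2, 1e-3):
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    history = []
    for step in range(500):
        opt.zero_grad()
        loss = F.mse_loss(generator(z), reference)
        loss.backward()
        opt.step()
        history.append(loss.item())
    plt.plot(history, label=f"lr={lr}")

plt.yscale("log")
plt.xlabel("step")
plt.ylabel("inversion loss")
plt.legend()
plt.show()
```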

33

IWantAGrapeInMyMouth t1_iwxqrxp wrote

Every time I've gotten NaN values it was due to a high learning rate, so I second this.

14

UltimateGPower t1_iwx4cie wrote

What about a VAE-GAN? It couples a VAE (encoder-decoder) network with a GAN by sharing weights between the generator and the decoder. This way you can use the encoder to obtain the latent variable of interest.
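A lighter-weight cousin of that idea is to keep the trained generator frozen and learn an encoder that regresses the latent from generated images (a rough sketch, not a full VAE-GAN; the encoder architecture and hyperparameters here are just placeholders, and `generator`/`latent_dim` are the hypothetical names from the post):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch: train an encoder E so that E(G(z)) ≈ z for a frozen generator G,
# then use E(reference) as the latent (or as the init for further optimization).
encoder = nn.Sequential(
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(256, latent_dim),
)
opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)

for step in range(10_000):
    with torch.no_grad():
        z = torch.randn(16, latent_dim)
        fake = generator(z)               # frozen generator provides (image, latent) pairs
    loss = F.mse_loss(encoder(fake), z)
    opt.zero_grad()
    loss.backward()
    opt.step()
```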

18

pilooch t1_iwybbh6 wrote

Hey there, this is a truly difficult problem. My colleagues and I train very precise GANs on a daily basis. We gave up on inversion and latent control a couple of years ago, and we actually don't need it anymore.

My raw take on this is that the GAN latent space is too compressed/folded for low-level control. When fine-tuning image-to-image GANs, for instance, we do get a certain degree of fine control over the generator, though we 'see' it snap to one 'mode' or another. Meaning, we witness a lack of smoothness that may implicitly prevent granular control.

Haven't looked at the theoretical side of this in a while though, so you may well know better...

15

bloc97 t1_iwyeh1x wrote

>the GAN latent space is too compressed/folded

I remember reading a paper that showed that GANs often fold many dimensions of the "internal" latent space into singularities, with large swathes of flat space between them (it's related to the mode collapse problem of GANs).

Back to the question: I guess that when OP is trying to invert the GAN using gradient descent, he is probably getting stuck in a local minimum. Try a global search metaheuristic on top of the gradient descent like simulated annealing or genetic algorithms?
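For illustration, a very crude version of that hybrid search (hypothetical `generator`, `reference`, and `latent_dim` as in the post; a mutate-and-select outer loop stands in for a proper simulated-annealing or genetic-algorithm implementation):

```python
import torch
import torch.nn.functional as F

def refine(z0, steps=50, lr=0.05):
    # Local search: a short gradient-descent refinement of one latent.
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    loss = None
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(generator(z), reference)
        loss.backward()
        opt.step()
    return z.detach(), float(loss)

# Global search: keep the best refined latents and mutate them.
population = [torch.randn(1, latent_dim) for _ in range(16)]
for generation in range(10):
    scored = sorted((refine(z) for z in population), key=lambda pair: pair[1])
    elites = [z for z, _ in scored[:4]]
    population = elites + [e + 0.3 * torch.randn_like(e) for e in elites for _ in range(3)]

best_z, best_loss = scored[0]
```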

9

_Ruffy_ OP t1_iwymjmq wrote

>Try a global search metaheuristic on top of the gradient descent like simulated annealing or genetic algorithms?

Will look into this, thanks!

2

wowAmaze t1_iwzvyru wrote

Do you remember the name of that paper?

1

Bitter_Campaign706 t1_iwxr165 wrote

Hey, I do research in GAN inversion; you can direct message me with any questions. But the keywords you are looking for are "GAN priors".

7

_Ruffy_ OP t1_iwymh68 wrote

Thanks for the generous offer! I will do some reading on GAN priors, and if I can't figure it out myself, I'll reach out :)

3

Firehead1971 t1_ix0b7h8 wrote

Also check for exploding or vanishing gradients, which can occur very early in the optimization!
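A small helper for that check (a sketch; `z` is the latent being optimized, called right after `loss.backward()` in the loop from the post):

```python
import torch

def latent_grad_norm(z: torch.Tensor) -> float:
    # Sketch: norms near zero hint at vanishing gradients; huge or NaN norms
    # hint at explosion (or AMP/precision issues).
    return float(z.grad.detach().norm()) if z.grad is not None else 0.0
```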

1

[deleted] t1_iwxms26 wrote

[deleted]

0

_Ruffy_ OP t1_iwymlw5 wrote

One. So I am passing a single latent through the generator. Shouldn't make a difference though, should it? The generator is in .eval() mode.

1