sjd96 t1_j32lyay wrote
I'm gonna ignore OP's condescending tone for a moment. Theoretically, it might be possible to invert a given target image (i.e. find the input noise that generates that image) with an optimization process, by backpropagating through the model. Something like:
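import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

# load_tensor, clip_encode, run_pretrained_latent_diffusion and converged are placeholders
target = load_tensor('mona_lisa.png')
prompt = clip_encode('a painting of a woman')
z = torch.randn(...).requires_grad_()
while not converged:
    z.grad = None
    pred = run_pretrained_latent_diffusion(prompt, z)
    loss = F.mse_loss(pred, target)  # or whatever perceptual loss
    loss.backward()
    with torch.no_grad():
        z -= 0.01 * z.grad  # in-place, so z stays a leaf tensor; or use your favorite optimizer here
# recovered noise that will generate mona_lisa.png when prompted with `a painting of a woman`
plt.imshow(z.detach())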
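For anyone who wants to poke at the idea without a diffusion model, here is a minimal sketch of the same optimization loop on a toy problem. Everything in it is an assumption for illustration: `generator` is a made-up stand-in (`tanh(W z)` with a fixed random `W`), not a real latent diffusion model, and the gradient is written out by hand instead of using autograd. It only shows the mechanics of "find the z whose output matches a target".

```python
import math
import random

def generator(W, z):
    """Hypothetical toy generator standing in for the diffusion model: x = tanh(W z)."""
    return [math.tanh(sum(w * zj for w, zj in zip(row, z))) for row in W]

def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def invert(W, target, steps=5000, lr=0.2, seed=1):
    """Gradient descent on z so that generator(W, z) approaches target."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(len(W[0]))]  # random initial noise
    n = len(target)
    losses = []
    for _ in range(steps):
        pred = generator(W, z)
        losses.append(mse(pred, target))
        # Chain rule through tanh: dL/d(pre_i) = 2/n * (pred_i - target_i) * (1 - pred_i^2)
        delta = [2.0 / n * (p - t) * (1 - p * p) for p, t in zip(pred, target)]
        # dL/dz_j = sum_i W[i][j] * delta_i
        grad = [sum(W[i][j] * delta[i] for i in range(n)) for j in range(len(z))]
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    losses.append(mse(generator(W, z), target))
    return z, losses

rng = random.Random(0)
W = [[rng.gauss(0, 0.5) for _ in range(4)] for _ in range(8)]  # fixed "model weights"
z_true = [rng.gauss(0, 1) for _ in range(4)]                   # noise that made the target
target = generator(W, z_true)                                  # the "image" to invert

z_rec, losses = invert(W, target)
print("initial loss:", losses[0])
print("final loss:  ", losses[-1])
```

With a real model the loop would have to backprop through every denoising step, so it would be far more expensive, and nothing guarantees the recovered noise is unique or looks like a typical Gaussian sample.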
What do others think?
Agreeable-Run-9152 t1_j33wpfm wrote
I thought it wasn't about the latent code but the training set?