Submitted by 0x00groot t3_y7u6gg in MachineLearning

Text-Based Real Image Editing

Code: https://github.com/ShivamShrirao/diffusers/tree/main/examples/imagic

Colab: https://colab.research.google.com/github/ShivamShrirao/diffusers/blob/main/examples/imagic/Imagic_Stable_Diffusion.ipynb

I still need to play around and tune the parameters a bit, so it may not work as is on every subject. Hopefully everyone can try it out now.

Input Image

A photo of Barack Obama smiling with a big grin.

131

Comments

Roarexe t1_isx0dpe wrote

Awesome, thanks for sharing!

3

ThatInternetGuy t1_isxfzu2 wrote

This Shivam Shrirao guy is super fast! Took him two days to make Dreambooth scripts and now just one day to make Imagic scripts.

15

deep-yearning t1_isxlgw1 wrote

paging Automatic1111

pls implement in webui

−1

thelastpizzaslice t1_isxw4bb wrote

What is the value of having a ckpt output? Is it like dreambooth?

1

thelastpizzaslice t1_isxxsok wrote

So, to use this, I run the colab, take the ckpt and also a pt that exists somewhere presumably, drop them into AUTOMATIC1111, and then I can pose a specific photo like it's a doll/restyle it at will in AUTOMATIC1111? Am I correct in this description?

2

danquandt t1_isy30dt wrote

How different is this in practice from running img2img on regular SD? The examples shown in the paper look very similar to what you would get from img2img, as far as I can tell.

(PS: great work on your repos! I still can't run Dreambooth on my 3080 10 GB but have played around with it in Colab and it's fantastic.)

2

thelastpizzaslice t1_isy86ca wrote

I decided to copy-paste the model into AUTOMATIC1111 anyway. I made one based on a photo of Atul from Spiritfarer with a loose description of him as "uncle frog spirit person" and it's actually the single best cartoon generator I've ever worked with. I've spent dozens of hours trying to make these things and this paper beat all of them by accident. What a time to be alive!

The author of this paper is apparently a genius who has built something better than TI or Dreambooth, and is massively understating his accomplishment.

Here are the three photos: #1 is standard, #2 is Dreambooth, #3 is Imagic.

This is Atul

5

thelastpizzaslice t1_it4lynp wrote

Does this use model v1.5 or is it still running on v1.4?

1

HuWasHere t1_itafuvj wrote

img2img, even at a high init image setting, doesn't necessarily respect the init image; this is far more precise. It's limited (to my knowledge) because it uses one input image, but the results are pretty incredible.

2

readyourSICP t1_itga0um wrote

Does this give the exact same output as running with 24 GB of VRAM?

1