YaYaLeB t1_isb71l6 wrote on October 14, 2022 at 5:04 PM

#103,381

Very nice :) (Love your Vader)
I had some issues with their script and switched to the diffusers text-to-image example (https://github.com/huggingface/diffusers/tree/main/examples/text_to_image). I've adapted their script and did the same thing for Magic cards and One Piece characters, if that interests you: https://github.com/YaYaB/finetune-diffusion

OnlineGrab OP t1_isc9beb wrote on October 14, 2022 at 9:20 PM

#105,372

Replying to YaYaLeB (#103,381)

Thanks, wish I knew about that repository before!

Out of curiosity, did you have to pay to host your demos on HuggingFace? I looked around for some free options with GPUs but only found Google Colab, which isn't very convenient for Gradio apps.

YaYaLeB t1_iscmu6c wrote on October 14, 2022 at 10:58 PM

#106,074

Replying to OnlineGrab (#105,372)

Nope, you can host your demo without paying (for the moment, I suppose), however you'll only have a CPU (very slow inference). If you upload your model to HuggingFace, feel free to copy-paste the space and modify the text + the model path for your repo (https://huggingface.co/spaces/YaYaB/text-to-onepiece). It is merely a copy-paste of the one made by lambdalabs, I did not get time to make something more personal ^^

OnlineGrab OP t1_ise9hnc wrote on October 15, 2022 at 8:50 AM

#108,605

Replying to YaYaLeB (#106,074)

Looks like you were lucky enough to be granted some A10G GPUs :)

Thanks for the tips, I uploaded it here https://huggingface.co/spaces/Gazoche/text-to-gundam

It's running on CPU so very slow, but easier to use than Colab at least.

YaYaLeB t1_isedaq2 wrote on October 15, 2022 at 9:46 AM

#108,733

Replying to OnlineGrab (#108,605)

Yep ^^

master3243 t1_isharj4 wrote on October 15, 2022 at 11:56 PM

#113,401

Impressive, how big is the dataset? Huggingface says n<2k which seems incredibly small.

Also, what is an individual sample point? A gundam image and its name?

OnlineGrab OP t1_ishhqxx wrote on October 16, 2022 at 12:51 AM

#113,656

Replying to master3243 (#113,401)

Thanks! There are 1565 images in the dataset. The original Pokemon project used an even smaller one (less than 1K images).

Each row is a gundam image + a text description. The original project used BLIP to auto-caption the images but that didn't really work for this dataset so instead I asked BLIP to only describe the colors and inserted them into a generic description: "A robot, humanoid, futuristic, <colors>". One could likely get better results with more fine-grained captions.
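A minimal sketch of that caption template, assuming the colors have already been extracted by the captioning model and arrive as a plain list of strings. The function name `build_caption`, the lowercasing, and the comma-joined format are illustrative assumptions, not the exact code used for the dataset:

```python
# Hypothetical helper: insert model-detected colors into a generic caption.
# The prefix matches the template quoted above; everything else is assumed.
GENERIC_PREFIX = "A robot, humanoid, futuristic"


def build_caption(colors):
    """Return the generic description followed by the given colors."""
    # Normalize entries and drop empty ones (assumed cleanup step).
    cleaned = [c.strip().lower() for c in colors if c.strip()]
    if not cleaned:
        # No usable colors: fall back to the generic description alone.
        return GENERIC_PREFIX
    joined = ", ".join(cleaned)
    return f"{GENERIC_PREFIX}, {joined}"


if __name__ == "__main__":
    print(build_caption(["White", "blue", " red "]))
```

Swapping the list of colors for finer-grained attributes (weapon, pose, series) would be one way to try the more detailed captions mentioned above.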

[P] Stable-Diffusion fine tuned on mechas from the anime franchise Gundam
