Submitted by YaYaLeB t3_y3usrj in MachineLearning

Hello everyone,

Since the release of diffusion models, we've seen many posts about them.

The one by Lambda Labs was very interesting and fun.

However, I personally found it difficult to finetune on my own data, so I created a repository that simplifies the process a bit: https://github.com/YaYaB/finetune-diffusion.

It breaks down into several steps:

  • Dataset creation: how to actually build a dataset using HuggingFace's datasets library
  • Captioning: if you do not have captions yet, generate them with BLIP, similarly to Lambda Labs (a sketch of both of these steps follows right after this list)
  • Finetuning: based on a script released by HuggingFace in their diffusers repository (a launch sketch follows as well)
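
To give a concrete idea of the dataset-creation and captioning steps, here is a rough sketch using BLIP from transformers and the "imagefolder" loader from datasets. The image folder path, the BLIP checkpoint and the "text" column name are illustrative assumptions, not the repo's exact script:

    import json
    from pathlib import Path

    from PIL import Image
    from datasets import load_dataset
    from transformers import BlipProcessor, BlipForConditionalGeneration

    # Hypothetical local folder of training images.
    image_dir = Path("data/my_images")

    # BLIP captioning model (base checkpoint; the repo may use another one).
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

    # Caption every image and collect (file_name, text) records.
    records = []
    for path in sorted(image_dir.glob("*.jpg")):
        image = Image.open(path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt")
        out = model.generate(**inputs, max_new_tokens=40)
        caption = processor.decode(out[0], skip_special_tokens=True)
        records.append({"file_name": path.name, "text": caption})

    # A metadata.jsonl file next to the images is how the "imagefolder" loader
    # attaches extra columns (here the caption) to each image.
    with open(image_dir / "metadata.jsonl", "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

    # Load it as a HuggingFace dataset; from here it can stay local or be pushed to the Hub.
    dataset = load_dataset("imagefolder", data_dir=str(image_dir), split="train")
    print(dataset)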

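For the finetuning step itself, the launch is roughly the following (expressed here as a Python subprocess call; the flag names are those of the upstream diffusers train_text_to_image.py example and the hyperparameter values are placeholders, so the repo's own script may differ):

    import subprocess

    # Equivalent to running "accelerate launch train_text_to_image.py ..." in a shell.
    subprocess.run(
        [
            "accelerate", "launch", "train_text_to_image.py",
            "--pretrained_model_name_or_path", "CompVis/stable-diffusion-v1-4",
            "--train_data_dir", "data/my_images",  # local dataset from the sketch above
            "--caption_column", "text",
            "--resolution", "512",
            "--train_batch_size", "1",
            "--gradient_accumulation_steps", "4",
            "--learning_rate", "1e-5",
            "--num_train_epochs", "30",
            "--output_dir", "my-finetuned-model",
        ],
        check=True,
    )
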
I've added a few functionalities throughout the process:

  • The captioning and dataset creation are simplified into a few scripts
  • Finetuning can be done on a local dataset (if you do not want to, or cannot, share your dataset on the HuggingFace Hub)
  • Validation prompts can be run at every epoch (to see when the model begins to overfit)
  • The model can be uploaded to the HuggingFace Hub every X epochs
  • A script to test your model locally has been added (see the sketch after this list)
  • A dataset card template is available
  • A Space app can be copied and modified

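As an example, testing locally boils down to something like this minimal sketch with diffusers' StableDiffusionPipeline (the model path and prompt are placeholders; the repo ships its own test script):

    import torch
    from diffusers import StableDiffusionPipeline

    # Point this at the training output directory, or at a model pushed to the Hub.
    pipe = StableDiffusionPipeline.from_pretrained(
        "my-finetuned-model",
        torch_dtype=torch.float16,
    )
    pipe = pipe.to("cuda")

    # Run a test / validation prompt and save the result.
    image = pipe("your test prompt here").images[0]
    image.save("sample.png")
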
In the Results section of the README you'll find some examples of prompts based on a model finetuned on One Piece characters and another one on Magic cards.

Demos are available (sorry in advance for the latency, I don't have a pro HuggingFace account yet).

Attached are some results from finetuning on Magic cards.

Next steps:

  • Dockerize everything to simplify the process
  • Dump the weights locally every X epochs (it takes a lot of disk space)
  • Add some visualization tool to play with it

Hope it can be helpful to anyone :)

107

Comments


hackerllama t1_isbrz8y wrote

Hey there! Omar from Hugging Face here. Very cool project! We just granted some free GPUs so people can enjoy much faster inference. Enjoy!

35

alach11 t1_isd6mox wrote

Sometimes I’m awestruck by Hugging Face. This is one of those times.

5

ebazarov t1_isaicc7 wrote

Hahaha, awesome examples in the repo and a really interesting project. How did you get those datasets?

11

YaYaLeB OP t1_isamm17 wrote

Thanks.
For anime characters: https://www.animecharactersdatabase.com/

For Magic cards: https://scryfall.com/

8

MasterScrat t1_isbr8km wrote

How many epochs did you train these models for?

3

YaYaLeB OP t1_iscmjrp wrote

50 epochs for the One Piece model (a bit too much imho), 30 for the Magic one. Both with 1k images and a low learning rate of 1e-5.

3

shawarma_bees t1_isbvg3b wrote

Thank you for sharing! Any chance your repository also supports fine tuning with class conditioning, rather than text conditioning?

2

YaYaLeB OP t1_isclyo2 wrote

Not for the moment, but you can create an issue on the repo and I'll see what I can do about it!

1

thelastpizzaslice t1_isc8qqr wrote

The One Piece one has the same issue as my cartoon DreamBooth, where the output is more blurry blobs than characters with clean line edges.

2

YaYaLeB OP t1_iscm3ua wrote

Yeah, I totally see what you mean; my One Piece dataset is not so great. I need to clean it a bit more or get better images. You can see the difference compared to the dataset used for the Pokemon finetune.

1

ghoumrassi t1_isedftf wrote

Omg thank you so much. I spent the best part of yesterday adapting the Pokemon pipeline for my use-case and needless to say it was a painful process.

Really excited to test out your fork!

2

YaYaLeB OP t1_isee5k8 wrote

I totally understand, I had the same pain ^^
Cheers mate!
Do not hesitate to raise an issue if something is not clear!

1