Submitted by YaYaLeB t3_y3usrj in MachineLearning

Hello everyone,

Since the release of diffusion models, we've seen many posts about them.

The one by Lambda Labs was very interesting and fun.

However, I personally found it difficult to finetune on my own data, so I created a repository that simplifies the process a bit: https://github.com/YaYaB/finetune-diffusion.

It breaks down into several steps:

  • Dataset creation: how to actually build a dataset using HuggingFace's datasets library
  • Captioning: if you do not have captions yet, generate them with BLIP, similarly to Lambda Labs (a sketch of both of these steps follows right after this list)
  • Finetuning: based on a script released by HuggingFace in their diffusers repository (a launch sketch follows as well)
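
To give a concrete idea of the dataset-creation and captioning steps, here is a rough sketch using BLIP from transformers and the "imagefolder" loader from datasets. The image folder path, the BLIP checkpoint and the "text" column name are illustrative assumptions, not the repo's exact script:

    import json
    from pathlib import Path

    from PIL import Image
    from datasets import load_dataset
    from transformers import BlipProcessor, BlipForConditionalGeneration

    # Hypothetical local folder of training images.
    image_dir = Path("data/my_images")

    # BLIP captioning model (base checkpoint; the repo may use another one).
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

    # Caption every image and collect (file_name, text) records.
    records = []
    for path in sorted(image_dir.glob("*.jpg")):
        image = Image.open(path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt")
        out = model.generate(**inputs, max_new_tokens=40)
        caption = processor.decode(out[0], skip_special_tokens=True)
        records.append({"file_name": path.name, "text": caption})

    # A metadata.jsonl file next to the images is how the "imagefolder" loader
    # attaches extra columns (here the caption) to each image.
    with open(image_dir / "metadata.jsonl", "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

    # Load it as a HuggingFace dataset; from here it can stay local or be pushed to the Hub.
    dataset = load_dataset("imagefolder", data_dir=str(image_dir), split="train")
    print(dataset)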

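For the finetuning step itself, the launch is roughly the following (expressed here as a Python subprocess call; the flag names are those of the upstream diffusers train_text_to_image.py example and the hyperparameter values are placeholders, so the repo's own script may differ):

    import subprocess

    # Equivalent to running "accelerate launch train_text_to_image.py ..." in a shell.
    subprocess.run(
        [
            "accelerate", "launch", "train_text_to_image.py",
            "--pretrained_model_name_or_path", "CompVis/stable-diffusion-v1-4",
            "--train_data_dir", "data/my_images",  # local dataset from the sketch above
            "--caption_column", "text",
            "--resolution", "512",
            "--train_batch_size", "1",
            "--gradient_accumulation_steps", "4",
            "--learning_rate", "1e-5",
            "--num_train_epochs", "30",
            "--output_dir", "my-finetuned-model",
        ],
        check=True,
    )
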
I've added a few functionalities throughout the process:

  • The captioning and dataset creation are simplified into a few scripts
  • Finetuning can be done on a local dataset (if you do not want to, or cannot, share your dataset on the HuggingFace Hub)
  • Validation prompts can be run at every epoch (to see when the model begins to overfit)
  • The model can be uploaded to the HuggingFace Hub every X epochs
  • A script to test your model locally has been added (see the sketch after this list)
  • A dataset card template is available
  • A Space app can be copied and modified

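As an example, testing locally boils down to something like this minimal sketch with diffusers' StableDiffusionPipeline (the model path and prompt are placeholders; the repo ships its own test script):

    import torch
    from diffusers import StableDiffusionPipeline

    # Point this at the training output directory, or at a model pushed to the Hub.
    pipe = StableDiffusionPipeline.from_pretrained(
        "my-finetuned-model",
        torch_dtype=torch.float16,
    )
    pipe = pipe.to("cuda")

    # Run a test / validation prompt and save the result.
    image = pipe("your test prompt here").images[0]
    image.save("sample.png")
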
In the Results section of the README you'll find some examples of prompts based on a model finetuned on One Piece characters and another one on Magic cards.

Demos are available (sorry in advance for the latency, I don't have a pro HuggingFace account yet).

Attached are some results from finetuning on Magic cards.

Next steps:

  • Dockerize everything to simplify the process
  • Dump the weights locally every X epochs (it takes a lot of disk space)
  • Add some visualization tool to play with it

Hope it can be helpful to anyone :)

107

Comments


hackerllama t1_isbrz8y wrote

Hey there! Omar from Hugging Face here. Very cool project! We just granted some free GPUs so people can enjoy much faster inference. Enjoy!

35

alach11 t1_isd6mox wrote

Sometimes I’m awestruck by Hugging Face. This is one of those times.

5

ebazarov t1_isaicc7 wrote

Hahaha, awesome examples in the repo and a really interesting project. How did you get those datasets?

11

YaYaLeB OP t1_isamm17 wrote

Thanks.
For anime characters: https://www.animecharactersdatabase.com/

For Magic cards: https://scryfall.com/

8

MasterScrat t1_isbr8km wrote

How many epochs did you train these models for?

3

YaYaLeB OP t1_iscmjrp wrote

50 epochs for the One Piece model (a bit too much imho), 30 for the Magic one. Both with 1k images and a low learning rate of 1e-5.

3

shawarma_bees t1_isbvg3b wrote

Thank you for sharing! Any chance your repository also supports fine tuning with class conditioning, rather than text conditioning?

2

YaYaLeB OP t1_isclyo2 wrote

Not for the moment, but you can create an issue on the repo and I'll see what I can do about it!

1

thelastpizzaslice t1_isc8qqr wrote

The One Piece one has the same issue as my cartoon DreamBooth, where the output is more blurry blobs than characters with clean line edges.

2

YaYaLeB OP t1_iscm3ua wrote

Yeah, I totally see what you mean; my One Piece dataset is not so great. I need to clean it a bit more or get better images. You can see the difference compared to the dataset used for the Pokemon finetune.

1

ghoumrassi t1_isedftf wrote

Omg thank you so much. I spent the best part of yesterday adapting the Pokemon pipeline for my use-case and needless to say it was a painful process.

Really excited to test out your fork!

2

YaYaLeB OP t1_isee5k8 wrote

I totally understand, I had the same pain ^^
Cheers mate!
Do not hesitate to raise an issue if something is not clear!

1