
sam__izdat t1_j99j0iu wrote

You're not likely to get much help there, unfortunately. With SD, your best bet would probably be Dreambooth, which you can get with the Hugging Face diffusers library. That might be overcomplicating matters, though, if the site is representative of your training data. GANs can be notoriously difficult to train, but it's probably worth a shot here -- it's a pretty basic use case. You might look into data augmentation and try a U-Net with a single-channel output.
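To make the data augmentation suggestion concrete, here is a minimal sketch in plain Python. It assumes each training image is a single-channel grid stored as a list of rows, and that flips and 90-degree rotations still produce valid samples for your data (that won't hold for everything, e.g. text). The function names `hflip`, `rot90`, and `augment` are made up for illustration and don't come from any library.

```python
def hflip(img):
    """Mirror a single-channel image (list of rows) left to right."""
    return [row[::-1] for row in img]


def rot90(img):
    """Rotate a single-channel image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def augment(img):
    """Return 8 variants: 4 rotations, each with and without a horizontal flip.

    Assumption: every flip/rotation of a training image is still a
    plausible sample for the dataset.
    """
    variants = []
    current = [list(row) for row in img]
    for _ in range(4):
        variants.append(current)
        variants.append(hflip(current))
        current = rot90(current)
    return variants


if __name__ == "__main__":
    sample = [[1, 2],
              [3, 4]]
    for variant in augment(sample):
        print(variant)
```

With a small dataset, this kind of thing multiplies the number of distinct training images by up to eight before you even get to crops or noise.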

A slightly more advanced option might be ProGAN. Here's a good video tutorial if that's your thing.


CurrentlyJoblessFML t1_j99mphb wrote

I definitely think diffusion-based generative AI models are a great idea, and I wholeheartedly agree that training GANs can be very painful. Head over to the Hugging Face diffusers library and you should be able to find a few models that can do unconditional image generation. They also have cookie-cutter scripts that you can just execute to start training your model right away, plus detailed instructions for setting up your own training data.

That said, I have been working with these models for a while, and training diffusion models can be very computationally intensive. Do you have access to a GPU cluster? If not, I'd recommend a U-Net based approach, which you could train on GPUs/TPUs on Google Colab.

I have been using this class of models for my master's thesis, and I would be happy to help if you have any questions. Good luck! :)


snowpixelapp t1_j99zc4b wrote

In my experiments, I have found the Dreambooth implementation in diffusers to be not very good. There are many alternatives to it, though.