Submitted by Remet0n t3_y69btw in deeplearning

I'm working on a computer vision task (segmentation) with limited training data (~5K high-resolution images along with their labels).

I'm thinking about generating new synthetic images (and matching synthetic labels) with a diffusion model.

This synthetic data could then be used to pretrain the supervised segmentation model.

Would this be a good idea?


Comments


vraGG_ t1_isnyylb wrote

I am not very experienced, but do I understand correctly that the problem is the size of the images? If so, have you heard of SAHI?
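If image size is the issue, the general idea of cutting a high-resolution image into overlapping tiles can be sketched as below. This is only a NumPy illustration of slicing, not the API of the library mentioned above; the function name `slice_into_tiles` and the `tile` and `overlap` defaults are assumptions made for the example.

```python
import numpy as np

def slice_into_tiles(image, tile=512, overlap=64):
    """Yield (y, x, patch) for overlapping tiles that cover the whole image.

    Hypothetical helper: tile size and overlap are illustrative defaults.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    h, w = image.shape[:2]
    stride = tile - overlap
    ys = list(range(0, max(h - tile, 0) + 1, stride))
    xs = list(range(0, max(w - tile, 0) + 1, stride))
    # Add a final row/column of tiles so the bottom and right borders are covered.
    if ys[-1] + tile < h:
        ys.append(h - tile)
    if xs[-1] + tile < w:
        xs.append(w - tile)
    for y in ys:
        for x in xs:
            yield y, x, image[y:y + tile, x:x + tile]
```

The same offsets can be applied to the label map, so each image tile keeps its matching label tile.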


syntheticdataguy t1_iso4kdf wrote

This might work. I am not sure how to generate labels though. Do you have a method to generate labels?


SnowFP t1_ispmm89 wrote

So, a couple of points. I'm sure there are several computer vision experts on this sub, but here are some of my opinions. Anyone, please feel free to correct me.

  1. If you mean you have 5000 images for segmentation, then I think this would be sufficient data to train, for example, a U-Net. If you are not getting the accuracy you want, perhaps look at how other people have been segmenting images in your domain for ideas.
  2. If you mean you have images at 5K resolution, how many images do you have? You would likely run into the same small-data problem when training a generative model. I assume you are already using domain-specific image augmentation techniques.
  3. When training a generative model (such as a diffusion model), you are inherently learning the distribution of the data. If you are able to generate high-fidelity images using this method, is there a way you could directly use this model to perform the segmentation task? (I am not familiar with the literature on diffusion models, but I know other generative models, such as GANs, have been used to perform image segmentation.)
  4. I'm not sure how you could generate labels in addition to the images with a generative model (perhaps there are smart ways of modifying the architecture to facilitate this). Perhaps other people can chime in here.

These points are aimed at getting high accuracy on this specific segmentation task, not at developing a novel segmentation algorithm. If the latter is what you are looking for, then they will not be very useful to you. Good luck!
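On the augmentation mentioned in point 2: for segmentation, any geometric transform has to be applied identically to the image and its label map. A minimal sketch, assuming NumPy arrays and using only flips and 90-degree rotations (the helper name `augment_pair` is hypothetical, and real domain-specific augmentation would likely go further):

```python
import numpy as np

def augment_pair(image, mask, rng):
    """Apply the same random horizontal flip and 90-degree rotation to image and mask."""
    if rng.random() < 0.5:
        # Flip both arrays along the width axis so pixels and labels stay aligned.
        image, mask = image[:, ::-1], mask[:, ::-1]
    k = int(rng.integers(0, 4))  # number of 90-degree rotations, 0 to 3
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)

rng = np.random.default_rng(0)
image = np.zeros((256, 256, 3), dtype=np.uint8)
mask = np.zeros((256, 256), dtype=np.int64)
aug_image, aug_mask = augment_pair(image, mask, rng)
```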


Remet0n OP t1_isq4lei wrote

Thanks, I do indeed have 5000 images. About the labels: I was thinking of modifying the net to generate RGBL images instead of RGB, where L stands for label. I guess the spatial information should then be well "linked" and the net should be able to generate the label.

No. 3 is a good point, I'll try to dig into it. Thanks for your point of view :)
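The RGBL idea could be prototyped on the data side roughly as follows. This is only a sketch under assumptions: the label is an integer class map, class ids are spread over [-1, 1] to share the value range commonly used for diffusion inputs, and the names `pack_rgbl` and `unpack_rgbl` are made up for illustration. The diffusion model itself is not shown.

```python
import numpy as np

def pack_rgbl(image, label, num_classes):
    """Stack an RGB image (H, W, 3, uint8) and a class map (H, W) into a float (H, W, 4) array in [-1, 1]."""
    rgb = image.astype(np.float32) / 127.5 - 1.0
    # Assumption: class ids are spread evenly over [-1, 1] (needs num_classes >= 2).
    lab = label.astype(np.float32) / (num_classes - 1) * 2.0 - 1.0
    return np.concatenate([rgb, lab[..., None]], axis=-1)

def unpack_rgbl(rgbl, num_classes):
    """Split a (possibly noisy) RGBL array back into a uint8 image and an integer class map."""
    rgb = np.clip((rgbl[..., :3] + 1.0) * 127.5, 0, 255).round().astype(np.uint8)
    # Generated label values will not be exact, so snap each pixel to the nearest class id.
    lab = (rgbl[..., 3] + 1.0) / 2.0 * (num_classes - 1)
    label = np.clip(np.rint(lab), 0, num_classes - 1).astype(np.int64)
    return rgb, label
```

One caveat with a single scalar L channel is that it imposes an ordering on the classes; using one channel per class is an alternative encoding, at the cost of more output channels.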


Remet0n OP t1_isryy89 wrote

About the labels: I was thinking of modifying the net to generate RGBL images instead of RGB, where L stands for label.

I guess the spatial information should then be well "linked" and the net should be able to generate labels.
