Hello,

I am working on a project in which I'm detecting cavities in X-rays.

The dataset I have is pretty limited (~100 images). Each X-ray has a black-and-white mask that shows where the cavities are in the image.

I'm trying to improve my results.

What I've tried so far:

different loss functions: BCE, Dice loss, BCE+Dice, Tversky loss, focal Tversky loss
modifying the images' gamma to make the cavities more visible
trying out different U-Nets: U-Net, V-Net, U-Net++, UNet 3+, Attention U-Net, R2U-Net, ResUNet-a, U^2-Net, TransUNet, and Swin-Unet
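
For readers unfamiliar with the Tversky-style losses in the list above, here is a minimal pure-Python sketch. It is not the code used in this project: the function names, the default weights (alpha=0.3, beta=0.7) and the exponent value are illustrative assumptions, and papers differ on how the focal exponent is defined.

```python
def tversky_index(pred, target, alpha=0.3, beta=0.7, eps=1e-7):
    """Tversky index for flat lists: pred holds probabilities, target holds 0/1 labels.

    alpha weights false positives and beta weights false negatives.
    With alpha = beta = 0.5 this reduces to the Dice coefficient.
    """
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    return (tp + eps) / (tp + alpha * fp + beta * fn + eps)


def focal_tversky_loss(pred, target, alpha=0.3, beta=0.7, exponent=0.75):
    """(1 - Tversky index) raised to an exponent (assumed convention, see note above)."""
    return (1.0 - tversky_index(pred, target, alpha, beta)) ** exponent
```

With beta larger than alpha, a missed cavity pixel costs more than a spurious one, which is the usual reason for trying this family of losses when small regions are being missed.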

None of the newer U-Net variants that I've tried improved the results, probably because they are better suited to larger datasets.

I'm now looking for other things to try to improve my results. Currently my network is detecting cavities, but it has trouble with the smaller ones.

Comments

trajo123 t1_je7dgjz wrote on March 29, 2023 at 11:03 PM

100 images??? Folks, neural nets are data hungry, if you don't have reams of data, don't fiddle with architectures, definitely not at first. The first thing to do when data is limited is to use pre-trained models. Then do data augmentation and only then look at other things like architectures and losses if you really have nothing better to do with your time.

SMP offers a wide variety of segmentation models with the option to use pre-trained weights.

viertys OP t1_je9mjxr wrote on March 30, 2023 at 12:46 PM

Thank you a lot! I will try SMP

Tight-Lettuce7980 t1_jea4ojl wrote on March 30, 2023 at 3:03 PM

How about medical images, which are more difficult to obtain due to privacy issues? I don't think it's easy to get for example 1000+ images. Would 300 - 700 or so be sufficient?

trajo123 t1_jebbxaf wrote on March 30, 2023 at 7:43 PM

Sufficient to train a model from scratch? Unlikely. Sufficient to fine-tune a model pre-trained on 1 million+ images (ImageNet, etc.)? Probably yes. As mentioned, some extra performance can be squeezed out with some smart data augmentation.

BrotherAmazing t1_je7vj9v wrote on March 30, 2023 at 1:19 AM

People saying get more than 100 images are right (all else being equal, yes, get more images!) but you likely can make good progress without as many images for your problem with clever augmentation and a smaller network.

Here’s why:

You only have to detect cavities. It’s not some 1,000-class semantic segmentation problem.
You should be working with single channel grayscale images, and not pixels that naturally come in 3-channel RGB color.
This is X-ray data just of teeth, so you don’t have nearly the amount of complex fine-detailed textures and patterns (with colors) that are exhibited in more general RGB optical datasets of all sorts of objects and environments.

Of course for a real operational system that you will use in commercial products you will need to get far more than 100 images. However, for a simple research problem or prototype demo, you can likely show good results and feasibility (without overfitting, yes) on your dataset with a smaller net and clever augmentation.

viertys OP t1_je9nno8 wrote on March 30, 2023 at 12:55 PM

I didn't mention it in the post, but I'm using the albumentations module. I rotate, shift, blur, horizontally flip, downscale, and add Gaussian noise. I get around 400 images after doing this. Is there anything you would suggest?
I have an accuracy of 98.50 and a Dice score of around 0.30-0.65 on each image.

And yes, the images are grayscale and they are cropped around the teeth area, so only that part of the X-ray remains.

MadScientist-1214 t1_je6o0st wrote on March 29, 2023 at 8:08 PM

Most new architectures based on U-Net do not actually work. Researchers need papers to get published, so they introduce leakage or optimize the seed. Segmentation papers in journals like CVPR are of better quality.

m98789 t1_je8gdwb wrote on March 30, 2023 at 4:12 AM

CVPR is not a journal

BreakingCiphers t1_je7dlg5 wrote on March 29, 2023 at 11:04 PM

While testing models and playing with hyperparams can be fun, the real problem is that you are trying to apply deep learning to 100 images.

Get more images.

Adventurous-Mouse849 t1_je7gyqe wrote on March 29, 2023 at 11:29 PM

And also data augmentation. Rotation, cropping, zooming. This is essential for data scarcity in medical imaging.
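
One detail worth spelling out for segmentation: every geometric transform has to be applied identically to the image and its mask, or the labels no longer line up. Below is a minimal pure-Python sketch using only flips and 90-degree rotations on nested lists. The helper names are made up for illustration, and whether 90-degree rotations are realistic for dental X-rays is a judgment call. Libraries such as albumentations handle the pairing when the image and mask are passed in the same call.

```python
import random


def hflip(grid):
    """Mirror a 2D grid (list of rows) left to right."""
    return [row[::-1] for row in grid]


def rot90(grid):
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def augment_pair(image, mask, rng):
    """Apply the same random flip and rotation to an image and its mask."""
    if rng.random() < 0.5:
        image, mask = hflip(image), hflip(mask)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        image, mask = rot90(image), rot90(mask)
    return image, mask


# Usage: one random augmentation of a tiny 3x3 example.
rng = random.Random(0)
image = [[0, 0, 0], [0, 0, 9], [0, 0, 0]]
mask = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
aug_image, aug_mask = augment_pair(image, mask, rng)
```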

viertys OP t1_je9mlvj wrote on March 30, 2023 at 12:47 PM

Adventurous-Mouse849 t1_jedi4wq wrote on March 31, 2023 at 5:48 AM

For augmentation that’s all bases covered. For more high-level or fully generative tasks I would also suggest mix-match (convex combo between similar samples). But you can’t justify that here bc you would have to relabel. Ultimately this does come down to too few images. If there’s a publicly available pretrained CT segmentation model you could fine-tune it to your task, or distill its weights to your model… just make sure they did a good job in the first place.

Also some other notes: I’d suggest sticking with distribution losses, i.e. cross entropy. U-Net is sensitive to normalization, so I’d also suggest training with and without normalized inputs.
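
To make the "with and without normalized inputs" comparison concrete, here is a minimal sketch of one common choice, standardizing grayscale pixels with a dataset-wide mean and standard deviation. The function names are hypothetical, and per-image or min-max scaling would be equally valid variants to test. The statistics should come from the training split only.

```python
import math


def dataset_mean_std(images):
    """Mean and standard deviation over all pixels of a list of 2D images."""
    pixels = [float(v) for img in images for row in img for v in row]
    mean = sum(pixels) / len(pixels)
    var = sum((v - mean) ** 2 for v in pixels) / len(pixels)
    return mean, math.sqrt(var)


def normalize(image, mean, std, eps=1e-8):
    """Standardize one image with precomputed statistics (eps avoids division by zero)."""
    return [[(v - mean) / (std + eps) for v in row] for row in image]
```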

azorsenpai t1_je6hjpu wrote on March 29, 2023 at 7:27 PM

Is there any reason you're restricting yourself to a U-Net-based model? I'd recommend testing different architectures such as DeepLab V3 or FPN to see whether things improve. If they don't, I'd recommend looking at your data and the quality of the ground truth: with only 100 data points, you are likely to be limited by the information contained in your data.

If the data is clean, I'd recommend using some kind of ensemble method. This might be overkill, especially with heavy models, but having multiple models with random initializations infer on the same input generally gives a few more points of accuracy/Dice, so if you really need it, this is an option.
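
A minimal sketch of the simplest form of this, averaging the per-pixel probabilities from several models and thresholding the mean. The function name and the 0.5 threshold are assumptions for illustration; the probability maps are plain nested lists standing in for model outputs.

```python
def ensemble_mask(prob_maps, threshold=0.5):
    """Average per-pixel probabilities from several models, then threshold.

    prob_maps: list of 2D grids (one per model), all with the same shape.
    Returns (mean_probabilities, binary_mask).
    """
    n = len(prob_maps)
    height, width = len(prob_maps[0]), len(prob_maps[0][0])
    mean = [
        [sum(m[r][c] for m in prob_maps) / n for c in range(width)]
        for r in range(height)
    ]
    mask = [[1 if p >= threshold else 0 for p in row] for row in mean]
    return mean, mask
```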

viertys OP t1_je6peyv wrote on March 29, 2023 at 8:17 PM

I started with U-Net, but I'm open to other architectures. I will try out DeepLab V3, thank you!

I believe the data is generally clean. Sadly, I can't get more data as all the datasets used in the research papers that I've read are private.

elbiot t1_je8iym9 wrote on March 30, 2023 at 4:38 AM

Looks like this was trained on just 150 x-rays and does very well: https://paperswithcode.com/paper/xnet-a-convolutional-neural-network-cnn

Edit: did you look for pre-existing solutions? This was like the second google result. If I were you I'd be looking for public datasets I could use for pretraining and then finetune on my data

deep-yearning t1_je710wy wrote on March 29, 2023 at 9:33 PM

What accuracy (Dice?) are you getting? 100 training images is pretty small. Have you tried cross-validation?
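
For reference, a minimal pure-Python sketch of a k-fold split over image indices. The function name, k=5 and the seed are illustrative assumptions. The split should be made on the original images before augmentation, so that augmented copies of one X-ray never appear in both the training and validation folds.

```python
import random


def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


# Usage: 100 images, 5 folds of 80 training and 20 validation images each.
for train_idx, val_idx in kfold_indices(100, k=5):
    pass  # train on train_idx, evaluate Dice on val_idx, then average across folds
```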

viertys OP t1_je9mocy wrote on March 30, 2023 at 12:47 PM

I have an accuracy of 98.50 and a Dice score of around 0.30-0.65 on each image.

deep-yearning t1_je9qqrf wrote on March 30, 2023 at 1:20 PM

Accuracy is not a good metric here given the large number of true negative pixels you will get.
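
A small sketch of why this happens, using flattened binary masks as plain Python lists. The function names and the smoothing constant are illustrative assumptions.

```python
def accuracy(pred, target):
    """Fraction of pixels where the binary prediction matches the label."""
    correct = sum(p == t for p, t in zip(pred, target))
    return correct / len(target)


def dice(pred, target, eps=1e-7):
    """Dice coefficient for binary masks given as flat lists of 0/1."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2 * intersection + eps) / (sum(pred) + sum(target) + eps)


# Made-up example: 5 foreground pixels out of 100, and a model that predicts nothing.
target = [1] * 5 + [0] * 95
all_background = [0] * 100
print(accuracy(all_background, target), dice(all_background, target))
```

In this made-up case the all-background prediction scores 0.95 accuracy while its Dice is essentially zero, so a high accuracy says little about whether any cavity was found.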

How large is the typical region you are trying to segment (in pixels)? If you've already done data augmentation I would also try to generate images if you can. Use a larger batch size, try different optimizers and a learning rate scheduler. How many images do not have cavities in them?
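
As one example of a learning rate scheduler, here is a minimal sketch of linear warmup followed by cosine decay, written as a plain function of the step number. All names and default values are assumptions for illustration, not settings recommended for this dataset; deep learning frameworks ship their own scheduler classes.

```python
import math


def lr_at_step(step, total_steps, base_lr=1e-3, warmup_steps=100, min_lr=1e-6):
    """Linear warmup to base_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)  # hold at min_lr past the end of training
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```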

viertys OP t1_je9srha wrote on March 30, 2023 at 1:36 PM

All images have cavities in them and in general the cavities make up 5-10% of the image.

Here is an example: https://imgur.com/a/z0yeH0C The mask on the left is the ground truth and the mask on the right is the predicted one.

I'm currently using Kaggle and I can't use very large batch sizes. My batch size is 4 now. Is there an alternative to Kaggle that you would suggest?

deep-yearning t1_je9te4j wrote on March 30, 2023 at 1:40 PM

Train locally on your own machine if you have a GPU, or try using Google Colab if you don't. Google Colab offers V100 GPUs, which should fit larger batch sizes.

To be honest, given how limited the data set is and how small some of the segmentation masks are, I am not sure other architectures will be able to do any better than U-Net.

I would also try the nnU-Net which should give state-of-the-art performance, and so will give you a good idea of what's possible with the dataset that you have: https://github.com/MIC-DKFZ/nnUNet

viertys OP t1_je9u6ny wrote on March 30, 2023 at 1:46 PM

Thank you, I will try nnU-net too

currentscurrents t1_je7c29r wrote on March 29, 2023 at 10:53 PM

The architecture probably isn't the problem. You only have 100 images, that's your problem.

If you can't get more labeled data, you should pretrain on unlabeled data that's as close as possible to your task - preferably other dental x-rays. Then you can finetune on your real dataset.

dubbitywap t1_je8xewl wrote on March 30, 2023 at 7:39 AM

Do you have a git repository that we can take a look at?

itsyourboiirow t1_je7n7p8 wrote on March 30, 2023 at 12:17 AM

Others have mentioned it, but do data augmentation, crop, resize, rotate, etc. and you'll be able to increase the size of your dataset and improve results.

viertys OP t1_je9mpwr wrote on March 30, 2023 at 12:48 PM

I didn't mention it in the post, but I'm using the albumentations module. I rotate, shift, blur, horizontally flip, downscale, and add Gaussian noise. I get around 400 images after doing this. Is there anything you would suggest?

Warhouse512 t1_je86y92 wrote on March 30, 2023 at 2:47 AM

Mask2Former?

mofoss t1_je9ayyx wrote on March 30, 2023 at 10:51 AM

Try segformers after augmentation

NoLifeGamer2 t1_je9gi5u wrote on March 30, 2023 at 11:51 AM

I recommend using bootstrapping to create more datapoints, then approve the ones you like and add them to the dataset. Then, train based on the larger dataset.

CyberDainz t1_je6qsbb wrote on March 29, 2023 at 8:26 PM

How well a segmentation model generalizes depends not only on the network configuration, but also on augmentation and on pretraining on a non-mask target.

Try my new project, Deep Roto: https://iperov.github.io/DeepXTools/