Submitted by alkaway t3_zkwrix in MachineLearning

I'm training a per-pixel image classification network, which, for each pixel in the image, predicts whether it is a sign for disease A or disease B. Note that a given pixel could be a sign for both disease A and disease B (this is a multi-label problem).

My question is: are the relative probabilities going to be calibrated? In other words, does it make sense to sort the NxNx2 probabilities, or are the probabilities for the two diseases (i.e. channels) not calibrated / comparable, since it is similar to solving two independent problems?

If it matters, I am using a ResNet, some fully-connected layers, and then a convolutional decoder.

Any thoughts will be much appreciated, thanks in advance!

21

Comments

bimtuckboo t1_j02jss1 wrote

The easiest way to find out is to make some calibration plots with your validation set. From there, depending on what the plots look like, there are some things you can do to improve the calibration post-training. Look into temperature scaling and Platt scaling.
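
For example, a per-channel reliability diagram could be produced like this (a rough sketch; `val_probs` and `val_labels` are placeholder names for flattened per-pixel sigmoid outputs and binary masks, not anything from the OP's actual pipeline):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve

# Hypothetical placeholders: (num_pixels, 2) predicted probabilities
# and binary labels, flattened from the validation set.
val_probs = np.random.rand(10000, 2)
val_labels = (np.random.rand(10000, 2) < val_probs).astype(int)

fig, ax = plt.subplots()
for k, name in enumerate(["disease A", "disease B"]):
    # fraction of positives vs. mean predicted probability per bin
    frac_pos, mean_pred = calibration_curve(val_labels[:, k], val_probs[:, k], n_bins=10)
    ax.plot(mean_pred, frac_pos, marker="o", label=name)
ax.plot([0, 1], [0, 1], "k--", label="perfect calibration")
ax.set_xlabel("mean predicted probability")
ax.set_ylabel("empirical frequency of positives")
ax.legend()
plt.show()
```

If the two curves deviate from the diagonal by very different amounts, that is a sign the channels are not directly comparable.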

19

alkaway OP t1_j02phdn wrote

Thanks so much for your response! Does temperature scaling change the relative ordering of the probabilities?

1

bimtuckboo t1_j02qida wrote

No, it does not. It simply scales the probabilities so that they are either all closer to 0.5 or all further from 0.5.

1

Moderatecat t1_j02b4ph wrote

Most modern deep neural nets are not well-calibrated by default. Your model's outputs, even after normalization, cannot be interpreted as probabilities unless the model is well-calibrated.

11

alkaway OP t1_j02oxhi wrote

Thanks so much for your response! Are you aware of any calibration methods I could try? Preferably ones which won't take too long to implement / incorporate :P

2

gosnold t1_j036xfj wrote

Temperature adjustment in the softmax layer is quick and easy
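
A minimal sketch of what that could look like for a multi-label (per-channel sigmoid) head, fitting a single scalar temperature on held-out logits; the tensor names and shapes here are assumptions:

```python
import torch
import torch.nn as nn

def fit_temperature(val_logits, val_labels):
    """Learn one scalar T > 0 that minimizes BCE of sigmoid(logits / T)
    on a held-out set. val_logits / val_labels: (num_pixels, 2) tensors."""
    log_t = torch.zeros(1, requires_grad=True)          # T = exp(log_t) stays positive
    opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=100)
    bce = nn.BCEWithLogitsLoss()

    def closure():
        opt.zero_grad()
        loss = bce(val_logits / log_t.exp(), val_labels.float())
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().item()

# At inference: calibrated = torch.sigmoid(test_logits / T)
# Dividing all logits by a single T never changes their ordering.
```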

4

ResponsibilityNo7189 t1_j02dzwf wrote

Getting your network's probabilities to be calibrated is an open problem. First, you might want to read up on aleatoric vs. epistemic uncertainty: https://towardsdatascience.com/aleatoric-and-epistemic-uncertainty-in-deep-learning-77e5c51f9423

Monte Carlo sampling and training have been used to get a sense of uncertainty.
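
For instance, MC dropout keeps dropout active at test time and averages several stochastic forward passes; a rough sketch, assuming the model actually contains dropout layers:

```python
import torch

@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=20):
    """Average sigmoid outputs over stochastic forward passes; the
    per-pixel standard deviation is a crude uncertainty estimate."""
    model.eval()
    for m in model.modules():
        if isinstance(m, (torch.nn.Dropout, torch.nn.Dropout2d)):
            m.train()                               # re-enable dropout only
    samples = torch.stack([torch.sigmoid(model(x)) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)
```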

Also, changing the softmax temperature to get less confident outputs might "help".

10

alkaway OP t1_j02oy3b wrote

Thanks so much for your response! Is temperature scaling the go-to calibration method I should try? Does temperature scaling change the relative ordering of the probabilities?

2

ResponsibilityNo7189 t1_j02t093 wrote

It does not change the order. It will make the predictions less "stark": instead of 0.99, 0.0001, 0.002, 0.007, you will get something like 0.75, 0.02, 0.04, 0.19 for instance. It is the easiest thing to do, but remember there isn't any "go-to" technique.

3

pm_me_your_ensembles t1_j01xzcw wrote

The two are not comparable. In a multi-class, single-label problem you do K distinct projections, one for each class, but they are then combined via a softmax to give you something that resembles probabilities. Since no such joint normalization is applied in your multi-label setup, the two channels don't influence each other in any way, so their outputs aren't directly comparable.
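
To make the contrast concrete, here is a toy illustration with made-up logits for one pixel, assuming the multi-label head uses per-channel sigmoids:

```python
import torch

logits = torch.tensor([[2.0, -1.0]])        # one pixel, channels = (disease A, disease B)

# Multi-label head: independent sigmoids, no coupling between channels.
torch.sigmoid(logits)                       # tensor([[0.8808, 0.2689]]) -- does not sum to 1

# Single-label head: softmax ties the channels into one distribution.
torch.softmax(logits, dim=-1)               # tensor([[0.9526, 0.0474]]) -- sums to 1
```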

However, you shouldn't treat whatever an NN outputs as a probability, even if it's within [0, 1], as NNs are known to be overconfident.

7

alkaway OP t1_j01zhl7 wrote

Thanks so much for your response!

This makes sense. Are you aware of any techniques that can be used to make these probabilities comparable?

I understand that the outputs shouldn't necessarily be treated as probabilities. I simply want a relative ordering of the pixels in terms of "likelihood."

3

trajo123 t1_j023qfb wrote

You could reformulate your problem to output 4 channels: "only disease A", "only disease B", "both disease A and disease B", and "no disease". This way a softmax can be applied to these outputs, with their probabilities summing to 1.

[EDIT] corrected number of classes
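
A sketch of the label encoding and loss this would imply (the mask names and shapes are assumptions about the OP's data, not taken from the post):

```python
import torch
import torch.nn.functional as F

# Hypothetical per-pixel binary masks for one image, shape (N, N).
a_mask = torch.randint(0, 2, (4, 4))
b_mask = torch.randint(0, 2, (4, 4))

# Joint class per pixel: 0 = neither, 1 = only A, 2 = only B, 3 = both.
joint = a_mask + 2 * b_mask

# The network would output (batch, 4, N, N) logits; softmax over dim=1
# gives a proper per-pixel distribution that sums to 1.
logits = torch.randn(1, 4, 4, 4)
probs = F.softmax(logits, dim=1)
loss = F.cross_entropy(logits, joint.unsqueeze(0))
```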

7

alkaway OP t1_j024u31 wrote

Thanks for your response -- This is an interesting idea! Unfortunately, I am actually training my network to predict 1000+ classes, for which such an idea would be computationally intractable...

2

trajo123 t1_j029y2r wrote

Ah, yes, it doesn't really make sense for more than a couple of classes. So if you can't turn your problem into a single-label multi-class one, have you tried any probability calibration on the model outputs? This should make them "more comparable"; I think that's the best you can do with a deep learning model.

But why do you want to rank the outputs per pixel? Wouldn't some per-image aggregate over the channels make more sense?

3

alkaway OP t1_j02owfb wrote

Thanks so much for your response! Are you aware of any calibration methods I could try? Preferably ones which won't take long to implement / incorporate :P

2

trajo123 t1_j031wsx wrote

Perhaps scikit-learn's "Probability calibration" section would be a good place to start. Good luck!
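
As a rough sketch of the two classical options covered there, fit a calibrator per channel on held-out scores (the arrays below are hypothetical stand-ins for one flattened disease channel):

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression

# Hypothetical held-out data for ONE channel: raw logits and binary labels.
val_logits = np.random.randn(5000)
val_labels = (np.random.rand(5000) < 1 / (1 + np.exp(-val_logits))).astype(int)

# Platt scaling: a logistic regression on the raw score.
platt = LogisticRegression().fit(val_logits.reshape(-1, 1), val_labels)
platt_probs = platt.predict_proba(val_logits.reshape(-1, 1))[:, 1]

# Isotonic regression: a non-parametric monotone mapping (needs more data).
iso = IsotonicRegression(out_of_bounds="clip").fit(val_logits, val_labels)
iso_probs = iso.predict(val_logits)
```

Both mappings are monotone in the score, so the ranking within a single channel is essentially unchanged; the point of calibrating is to make scores across channels more comparable.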

2

[deleted] t1_j023o61 wrote

[deleted]

1

alkaway OP t1_j02675d wrote

I'm not sure I understand. Are you suggesting I normalize each pixel in each NxN label-map to be mean 0 and std of 1? And then use this normalized label-map during training?

1

pm_me_your_ensembles t1_j02eijz wrote

Never mind my previous comment.

You could normalize both channels, i.e. for label 1, normalize the NxN tensor of per-pixel scores, and do the same for label 2.
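
Something like this per-channel z-scoring, if that is what's meant (a sketch with assumed shapes; it only puts the two channels on a shared scale, it does not calibrate them):

```python
import torch

scores = torch.randn(2, 64, 64)        # (channels, N, N) raw outputs, one channel per disease

# Standardize each channel independently so both live on a comparable scale.
mean = scores.mean(dim=(1, 2), keepdim=True)
std = scores.std(dim=(1, 2), keepdim=True)
normalized = (scores - mean) / (std + 1e-8)

# Joint ranking of all 2*N*N pixel/disease scores.
ranking = normalized.flatten().argsort(descending=True)
```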

1

SlowFourierT198 t1_j02nin5 wrote

Depending on the problem, you may use Bayesian neural networks, where you fit a distribution over the weights; they are better calibrated but also expensive. There is some work on lower-cost ways to make the model better calibrated / uncertainty-aware. One direction is using Gaussian process approximations; another is, for example, PostNet. The overall topic you can search for is uncertainty quantification.
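
For a sense of what "a distribution over the weights" means in practice, here is a minimal mean-field variational (Bayes-by-Backprop-style) linear layer; this is only an illustrative sketch, not PostNet or a GP approximation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLinear(nn.Module):
    """Linear layer with a factorized Gaussian posterior over its weights."""
    def __init__(self, in_features, out_features, prior_std=1.0):
        super().__init__()
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -5.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -5.0))
        self.prior_std = prior_std

    def forward(self, x):
        w_std = F.softplus(self.w_rho)
        b_std = F.softplus(self.b_rho)
        w = self.w_mu + w_std * torch.randn_like(w_std)    # sample weights each pass
        b = self.b_mu + b_std * torch.randn_like(b_std)
        self.kl = self._kl(self.w_mu, w_std) + self._kl(self.b_mu, b_std)
        return F.linear(x, w, b)

    def _kl(self, mu, std):
        # KL( N(mu, std^2) || N(0, prior_std^2) ), summed over parameters.
        var, p_var = std ** 2, self.prior_std ** 2
        return 0.5 * torch.sum(var / p_var + mu ** 2 / p_var - 1.0 - torch.log(var / p_var))
```

Training adds the accumulated `kl` terms (suitably weighted) to the usual loss, and predictions average several forward passes.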

4

alkaway OP t1_j02pj3l wrote

Thanks so much for your response! Will take a look.

2

Red-Portal t1_j02wvcs wrote

With deep neural networks, I would say conformal prediction is the best way to get uncertainty estimates.
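
A minimal split-conformal sketch for this multi-label setting (the array names and the choice of nonconformity score are assumptions):

```python
import numpy as np

def conformal_qhat(cal_probs, cal_labels, alpha=0.1):
    """Split conformal: nonconformity = 1 - predicted probability of a true label.
    cal_probs, cal_labels: (num_pixels, 2) held-out sigmoid outputs and binary labels."""
    nonconf = (1.0 - cal_probs)[cal_labels.astype(bool)]     # scores of the true labels only
    n = nonconf.shape[0]
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(nonconf, level, method="higher")

# Prediction sets: include a disease for a pixel whenever its probability
# clears the calibrated threshold, aiming for roughly 1 - alpha coverage.
# pred_sets = test_probs >= 1 - qhat
```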

1

CommunismDoesntWork t1_j03qv7i wrote

Why do you need probabilities? You'd be better off spending more time making your model more accurate, period, even if it is confidently wrong sometimes.

0