
madhatter09 t1_j0jtxaq wrote

There are several papers on this idea - the best one is probably On Calibration of Modern Neural Networks by Guo et al. The gist is that you want your softmax output to match the probability of your prediction being correct. For an architecture like yours, they fix the mismatch with a post-hoc method called temperature scaling: a single scalar, fit on a held-out set, that rescales the logits before the softmax. Why this works is a more involved topic, but it helps to think about what cross entropy does with hard labels (exact 1s and 0s) versus soft labels (values in between).
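
In case it helps, here is a minimal numpy/scipy sketch of temperature scaling, assuming you already have held-out validation logits and integer labels as arrays (the function names and the toy data at the bottom are just illustrative, not code from the paper):

```python
# Minimal temperature-scaling sketch: fit one scalar T on held-out logits.
import numpy as np
from scipy.optimize import minimize_scalar

def nll_at_temperature(T, logits, labels):
    """Negative log-likelihood of softmax(logits / T) on the held-out set."""
    scaled = logits / T
    # log-softmax with max-subtraction for numerical stability
    scaled = scaled - scaled.max(axis=1, keepdims=True)
    log_probs = scaled - np.log(np.exp(scaled).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def fit_temperature(val_logits, val_labels):
    """Find the scalar T > 0 that minimizes NLL on the validation set."""
    result = minimize_scalar(
        nll_at_temperature,
        bounds=(0.05, 20.0),
        args=(val_logits, val_labels),
        method="bounded",
    )
    return result.x

# Toy example with fake logits; at test time, divide logits by T before softmax.
val_logits = np.random.randn(500, 10) * 5.0   # deliberately "overconfident"
val_labels = np.random.randint(0, 10, size=500)
T = fit_temperature(val_logits, val_labels)
print(f"fitted temperature: {T:.2f}")
```

Note that T only rescales the logits, so the argmax (and hence accuracy) is unchanged - only the confidence values move.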

I think then going into OOD detection, as the others suggest, would be more fruitful. The whole area of distribution shift, and the extreme case of full OOD, gets murky and detached from what happens in practice, but ultimately the goal is to be able to tell that there is a mismatch between the input and the model, rather than just seeing low confidence.
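
For contrast, here is a sketch of the plain max-softmax-probability baseline - which is exactly the "just low confidence" heuristic I mean, so treat it as a starting point rather than real OOD detection. The threshold value is an arbitrary assumption:

```python
# Max-softmax-probability (MSP) baseline: flag inputs the model is unsure about.
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def flag_low_confidence(logits, threshold=0.7):
    """Return a boolean mask: True where the max softmax confidence is below threshold."""
    confidence = softmax(logits).max(axis=1)
    return confidence < threshold

# Example: low-confidence inputs get flagged for a closer look.
logits = np.random.randn(8, 10)
print(flag_low_confidence(logits))
```

The problem is that this conflates "uncertain but in-distribution" with "out of distribution" - an overconfident network can give high confidence on inputs it has never seen anything like.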

24

vwings t1_j0l4gq9 wrote

Great question and comment! I think the first thing to state here is that CNNs are usually overconfident.

One thing the original post is looking for is calibration of the classifier on a held-out calibration set. On that set, the softmax values can be re-adjusted to be close to true probabilities - this is essentially what Platt scaling does, and conformal prediction uses a calibration set in the same spirit to turn scores into prediction sets with a coverage guarantee.
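
To make the calibration-set idea concrete, here is a minimal split-conformal sketch for classification, assuming you have held-out softmax outputs and labels (the score choice, alpha, and helper names are my own illustrative assumptions, not from a specific library):

```python
# Split conformal prediction sketch: score = 1 - softmax of the true class.
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Quantile of nonconformity scores giving ~(1 - alpha) marginal coverage."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, min(q_level, 1.0), method="higher")

def prediction_set(test_probs, qhat):
    """Include every class whose nonconformity score is below the calibrated threshold."""
    return [np.where(1.0 - p <= qhat)[0] for p in test_probs]

# Toy example with fake softmax outputs for a 5-class problem.
cal_probs = np.random.dirichlet(np.ones(5), size=200)
cal_labels = np.random.randint(0, 5, size=200)
qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
print(prediction_set(np.random.dirichlet(np.ones(5), size=3), qhat))
```

The output is a set of classes per input rather than a single calibrated probability - large sets are themselves a useful signal that the model is unsure.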

I strongly recommend this year's talk on Conformal Prediction which provides insights into these problems. Will try to find the link...

1

Extra_Intro_Version t1_j0ldo9z wrote

I have a classification model that uses conformal prediction. This has been helpful in working towards building out a high-confidence dataset.

2