Submitted by neuralbeans t3_10puvih in deeplearning
I'd like to train a neural network where the softmax output has a minimum possible probability. During training, none of the probabilities should go below this minimum. Basically I want to prevent the logits from becoming too different from each other, so that none of the output categories is ever completely excluded from a prediction, a sort of smoothing. What's the best way to do this during training?
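In case a concrete sketch helps clarify what I mean: one way to express the floor I'm describing is to mix the softmax output with a uniform distribution, so every class keeps at least a fixed minimum probability. This is only an illustration in PyTorch; the names `floored_softmax` and `min_prob` are mine, not from any library.

```python
import torch
import torch.nn.functional as F

def floored_softmax(logits: torch.Tensor, min_prob: float) -> torch.Tensor:
    """Softmax whose outputs are guaranteed to be >= min_prob.

    Mixes the ordinary softmax with a uniform floor:
        p = (1 - K * min_prob) * softmax(logits) + min_prob
    Every class probability stays at least min_prob and the result
    still sums to 1 (requires min_prob < 1 / K).
    """
    num_classes = logits.size(-1)
    assert min_prob * num_classes < 1.0, "min_prob must be below 1 / num_classes"
    probs = F.softmax(logits, dim=-1)
    return (1.0 - num_classes * min_prob) * probs + min_prob

# Example: train against the floored output with an NLL loss on log-probabilities.
logits = torch.randn(8, 10, requires_grad=True)   # batch of 8, 10 classes
targets = torch.randint(0, 10, (8,))
probs = floored_softmax(logits, min_prob=0.01)    # every probability >= 0.01
loss = F.nll_loss(torch.log(probs), targets)
loss.backward()
```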
like_a_tensor t1_j6mcv1v wrote
I'm not sure how to enforce a hard minimum probability, but you could try softmax with a high temperature, which flattens the output distribution.
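For reference, temperature scaling just divides the logits by some T > 1 before the softmax, which pushes the distribution toward uniform (though it doesn't guarantee a hard floor). A minimal PyTorch sketch, with the temperature value chosen arbitrarily:

```python
import torch
import torch.nn.functional as F

def softmax_with_temperature(logits: torch.Tensor, temperature: float = 5.0) -> torch.Tensor:
    # Dividing the logits by a temperature > 1 flattens the distribution,
    # so no class probability is driven as close to zero.
    return F.softmax(logits / temperature, dim=-1)

logits = torch.tensor([[4.0, 1.0, -2.0]])
print(F.softmax(logits, dim=-1))                          # sharp:   ~[0.95, 0.05, 0.002]
print(softmax_with_temperature(logits, temperature=5.0))  # flatter: ~[0.54, 0.30, 0.16]
```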