filipposML t1_j90m8x0 wrote

You might be interested in Tishby's work on rate distortion (the information bottleneck). For example, in this paper they analyze how the mutual information in the hidden layers behaves as a neural network is trained to convergence.
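
To make "mutual information in the hidden layers" a bit more concrete, here is a minimal sketch of one simple way to estimate it: bin each hidden unit's activation, treat the tuple of bin indices as a discrete layer state T, and compute a plug-in estimate of I(X;T) and I(T;Y) from counts. This is an assumption on my part about how such a measurement could be done, not necessarily the procedure used in the paper; the helper names (`discretize`, `layer_states`, `mutual_information`), the toy activations, and the choice of two bins over [-1, 1] are all made up for illustration.

```python
import math
from collections import Counter


def discretize(values, n_bins, lo, hi):
    """Map each float to a bin index in [0, n_bins - 1] (assumed range [lo, hi])."""
    width = (hi - lo) / n_bins
    return [min(n_bins - 1, max(0, int((v - lo) / width))) for v in values]


def layer_states(activations, n_bins=2, lo=-1.0, hi=1.0):
    """Turn each sample's activation vector into a hashable tuple of bin indices."""
    return [tuple(discretize(row, n_bins, lo, hi)) for row in activations]


def mutual_information(xs, ts):
    """Plug-in estimate of I(X;T) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ts))
    px = Counter(xs)
    pt = Counter(ts)
    mi = 0.0
    for (x, t), c in joint.items():
        # p(x,t) * log2( p(x,t) / (p(x) * p(t)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * pt[t]))
    return mi


# Hypothetical toy data: 4 inputs, 2 hidden units with tanh-like outputs.
inputs = [0, 1, 2, 3]   # one distinct id per input sample
labels = [0, 0, 1, 1]   # binary class labels
hidden = [[-0.9, 0.8], [-0.7, 0.9], [0.6, -0.8], [0.9, -0.5]]

T = layer_states(hidden)
i_xt = mutual_information(inputs, T)  # how much the layer retains about the input
i_ty = mutual_information(T, labels)  # how much the layer says about the label
```

Repeating this per layer and per training epoch gives the kind of trajectory such an analysis looks at. Note that the plug-in estimate depends heavily on the bin count and the number of samples, so the absolute values should be read with caution.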
