
koiRitwikHai t1_ix93rg6 wrote

Why does ReLU perform better than other activation functions when it is neither differentiable everywhere (it has a kink at 0) nor zero-centered?
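To make the two properties in the question concrete, here is a minimal sketch in plain Python. The helper names (`relu`, `one_sided_slopes`, `mean_output`) are made up for illustration and are not from any library. It only shows that the left and right slopes at 0 disagree and that the average output on zero-mean inputs is positive; it says nothing about why ReLU trains well.

```python
import random

def relu(x):
    # ReLU: passes positive inputs through, clamps negatives to 0.
    return max(0.0, x)

def one_sided_slopes(f, x, h=1e-6):
    # Finite-difference slopes approaching x from the left and from the right.
    left = (f(x) - f(x - h)) / h
    right = (f(x + h) - f(x)) / h
    return left, right

def mean_output(f, n=10000, seed=0):
    # Average of f over n samples drawn from a standard normal (zero-mean input).
    rng = random.Random(seed)
    return sum(f(rng.gauss(0.0, 1.0)) for _ in range(n)) / n

left, right = one_sided_slopes(relu, 0.0)
print("slopes at 0:", left, right)   # the two slopes differ, so no derivative at 0
print("mean output:", mean_output(relu))  # positive, so outputs are not zero-centered
```

For standard normal inputs the expected ReLU output is 1/sqrt(2*pi), about 0.399, so the sampled mean should land near that value rather than near 0.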

6