[D]: Vanishing Gradients and ResNets
Submitted by Blutorangensaft on March 20, 2023 at 4:00 PM in MachineLearning
deep_alichemist wrote on March 21, 2023 at 12:11 PM:
Use some form of normalization in addition to skip connections. A ResNet's skip connections alone are not enough unless you carefully tune everything (e.g. https://arxiv.org/abs/1901.09321).
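To make the point above concrete, here is a rough sketch in plain Python (standard library only) of a stack of residual blocks with and without normalization. Everything in it is an assumption for illustration: the helper names (`layer_norm`, `residual_block`, `run_stack`), the pre-norm layout, the choice of layer normalization, and the He-style random weights are not from the thread or from any particular ResNet implementation. It only tracks the forward activation scale, which is one symptom of the stability problem, not the gradients themselves.

```python
import math
import random


def layer_norm(x, eps=1e-5):
    """Rescale a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def linear(x, w):
    """Matrix-vector product; w is a list of rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def relu(x):
    return [max(0.0, v) for v in x]


def residual_block(x, w, normalize=True):
    """Hypothetical pre-norm block: x + W * relu(norm(x))."""
    h = layer_norm(x) if normalize else x
    h = linear(relu(h), w)
    # Skip connection: add the branch output back onto the input.
    return [xi + hi for xi, hi in zip(x, h)]


def run_stack(depth, width=16, normalize=True, seed=0):
    """Push one random vector through `depth` blocks and return its final RMS."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    for _ in range(depth):
        # He-style initialization (an assumption, not a tuned scheme).
        std = math.sqrt(2.0 / width)
        w = [[rng.gauss(0.0, std) for _ in range(width)] for _ in range(width)]
        x = residual_block(x, w, normalize)
    return math.sqrt(sum(v * v for v in x) / width)


if __name__ == "__main__":
    print("RMS without normalization:", run_stack(50, normalize=False))
    print("RMS with normalization:   ", run_stack(50, normalize=True))
```

In this toy setup the unnormalized stack lets the activation scale compound from block to block, while the normalized branch contributes a bounded amount per block, so the final scale stays far smaller. A carefully tuned initialization is the alternative route the comment mentions; this sketch does not attempt that.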