tdgros t1_jdqjc8q wrote

There's also weight averaging in ESRGAN, which I knew about, but it always irked me. The permutation argument from your third point is the reason I usually invoke on this subject, and the paper does show why it's not as simple as just blending weights! The same reasoning also shows why blending subsequent checkpoints isn't like blending random networks.
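
For anyone who wants to see the permutation argument concretely, here is a minimal NumPy sketch. It is a toy illustration under my own assumptions (a random two-layer ReLU MLP, with hypothetical names like `mlp` and `perm`), not the setup from the paper or from ESRGAN. Permuting the hidden units leaves the function unchanged, but naively averaging the original weights with the permuted copy gives a different function.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """Two-layer ReLU MLP: relu(x @ W1 + b1) @ W2 + b2."""
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 4, 8, 3  # toy sizes, chosen arbitrarily

W1 = rng.normal(size=(d_in, d_hidden))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(d_hidden, d_out))
b2 = rng.normal(size=d_out)

# A non-identity permutation of the hidden units (cyclic shift by one).
perm = np.roll(np.arange(d_hidden), 1)

# Permute the columns of W1 and b1, and the matching rows of W2.
W1p, b1p, W2p = W1[:, perm], b1[perm], W2[perm, :]

x = rng.normal(size=(16, d_in))
y = mlp(x, W1, b1, W2, b2)
y_perm = mlp(x, W1p, b1p, W2p, b2)

# Naive 50/50 weight blend of the two functionally identical networks.
y_avg = mlp(x, 0.5 * (W1 + W1p), 0.5 * (b1 + b1p), 0.5 * (W2 + W2p), b2)

same_function = bool(np.allclose(y, y_perm))   # permuted net computes the same outputs
avg_differs = not np.allclose(y, y_avg)        # blended net does not

print("permuted net matches original:", same_function)
print("averaged net differs from original:", avg_differs)
```

Both networks here are the same function, yet their midpoint in weight space is not, which is the sense in which blending unaligned networks is not as simple as blending weights.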