Is it possible to merge transformers? [D] Submitted by seraphaplaca2 t3_122fj05 on March 26, 2023 at 8:16 AM in MachineLearning 12 comments 12
locomoto00 t1_jdr34oo wrote on March 26, 2023 at 3:06 PM For some models you can simply average the model weights: see https://arxiv.org/pdf/2208.03306.pdf
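To make "average the model weights" concrete, here is a minimal sketch of plain uniform averaging. It is not taken from the linked paper and may differ from what the paper does. It assumes every model has exactly the same architecture (same parameter names and sizes), and it represents each checkpoint as a hypothetical dict mapping parameter names to flat lists of floats so that it runs without any ML library. The name `average_weights` and the toy checkpoints are made up for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def average_weights(state_dicts: Sequence[StateDict]) -> StateDict:
    """Return the element-wise mean of the parameters of several models.

    All models must share parameter names and sizes; otherwise averaging
    is not well defined and a ValueError is raised.
    """
    if not state_dicts:
        raise ValueError("need at least one state dict")

    reference = state_dicts[0]
    for sd in state_dicts[1:]:
        if sd.keys() != reference.keys():
            raise ValueError("parameter names differ between models")

    n = len(state_dicts)
    averaged: StateDict = {}
    for name, ref_values in reference.items():
        for sd in state_dicts:
            if len(sd[name]) != len(ref_values):
                raise ValueError(f"size mismatch for parameter {name!r}")
        # Average position by position across all models.
        averaged[name] = [
            sum(sd[name][i] for sd in state_dicts) / n
            for i in range(len(ref_values))
        ]
    return averaged


# Toy checkpoints with identical structure (illustrative values only).
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = average_weights([model_a, model_b])
print(merged)
```

With real checkpoints the same loop would run over tensors instead of lists. Whether the averaged model is any good is a separate question: this tends to work best when the models were fine-tuned from the same starting checkpoint, which is why the comment says "for some models".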