Submitted by **seraphaplaca2** t3_122fj05
in **MachineLearning**

Over the last few days I had a new thought, and I don't know whether it is possible or has already been done somewhere. Is it possible to merge the weights of two transformer models, the way people merge Stable Diffusion models? For example, could I merge BioBert and LegalBert and get a model that can do both?

tdgrost1_jdqbgqy wrote: The model merging offered by some Stable Diffusion UIs does not merge the weights of a network! It merges the denoising results for a single diffusion step from two different denoisers, which is very different!

Merging the weights of two different models does not, in general, produce something functional. It can also only work for two models with exactly the same structure. It certainly does not "mix their functionality".
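
To make the "exactly the same structure" requirement concrete, here is a minimal sketch of what weight merging means mechanically: element-wise linear interpolation between two parameter sets. Everything in it is an assumption for illustration. The `Weights` type is a toy stand-in for a real checkpoint (parameter name mapped to a flat list of values), and `same_structure` and `interpolate` are hypothetical helper names, not functions from any library. It only shows the arithmetic and the structure check; it says nothing about whether the merged model is actually useful.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def same_structure(a: Weights, b: Weights) -> bool:
    """True only if both models have the same parameter names and sizes."""
    return a.keys() == b.keys() and all(len(a[k]) == len(b[k]) for k in a)


def interpolate(a: Weights, b: Weights, alpha: float = 0.5) -> Weights:
    """Element-wise (1 - alpha) * a + alpha * b for every parameter."""
    if not same_structure(a, b):
        raise ValueError("models must have identical parameter names and sizes")
    return {
        name: [(1 - alpha) * x + alpha * y for x, y in zip(a[name], b[name])]
        for name in a
    }


# Two tiny "models" with identical structure (hypothetical values).
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = interpolate(model_a, model_b, alpha=0.5)
print(merged)
```

If the parameter names or sizes differ, as they would for two models with different architectures, `interpolate` raises a `ValueError` rather than producing anything.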