jayalammar OP t1_ir2im9w wrote

New Stable Diffusion models have to be trained to use OpenCLIP. That's because many components in the attention/ResNet layers were trained to deal with the representations learned by CLIP, so swapping in OpenCLIP without retraining would be disruptive.

In that training process, however, OpenCLIP can be kept frozen, just as CLIP was frozen during the training of Stable Diffusion / LDM.
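To make the idea concrete, here is a toy sketch in plain Python, not actual Stable Diffusion code. Everything in it is an assumption for illustration: single scalar weights stand in for the text encoder and for the attention/ResNet layers, and the names `encode`, `denoise`, and `train` are made up. It only shows the two points above: layers trained against one encoder give wrong outputs when a different encoder is swapped in, and retraining them fixes that while the new encoder's weights stay untouched.

```python
# Toy illustration only: scalars stand in for real networks.

def encode(encoder_weight, token):
    """Stand-in for a text encoder (CLIP or OpenCLIP)."""
    return encoder_weight * token


def denoise(layer_weight, conditioning):
    """Stand-in for the layers that consume the text representation."""
    return layer_weight * conditioning


def train(encoder_weight, layer_weight, data, lr=0.01, epochs=200):
    """Fit layer_weight by gradient descent; the encoder is never updated."""
    for _ in range(epochs):
        for token, target in data:
            conditioning = encode(encoder_weight, token)
            prediction = denoise(layer_weight, conditioning)
            # Gradient of (prediction - target)^2 with respect to layer_weight.
            grad = 2 * (prediction - target) * conditioning
            layer_weight -= lr * grad
            # encoder_weight gets no update: this is what "frozen" means here.
    return encoder_weight, layer_weight


data = [(1.0, 6.0), (2.0, 12.0)]  # hypothetical (token, target) pairs

CLIP_LIKE = 3.0      # hypothetical weight of the original encoder
OPENCLIP_LIKE = 1.5  # hypothetical weight of the replacement encoder

# 1) Train the layers against the first frozen encoder.
clip_after, layers = train(CLIP_LIKE, 0.0, data)

# 2) Swap the encoder without retraining: the layers now see
#    representations they were never trained on.
swapped_prediction = denoise(layers, encode(OPENCLIP_LIKE, 1.0))

# 3) Retrain the layers with the new encoder, again kept frozen.
openclip_after, new_layers = train(OPENCLIP_LIKE, layers, data)
retrained_prediction = denoise(new_layers, encode(OPENCLIP_LIKE, 1.0))

print("swapped, no retraining:", swapped_prediction)
print("after retraining:      ", retrained_prediction)
```

In a real PyTorch training loop, the frozen part would typically be expressed by disabling gradients on the encoder's parameters (for example with `requires_grad_(False)`) and passing only the remaining parameters to the optimizer.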
