AnOnlineHandle t1_jaesse4 wrote on February 28, 2023 at 10:32 PM

Reply to comment by pawsibility in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152

The CLIP model in the Stable Diffusion 1.5 package is 480 MB according to the directory where diffusers unpacked it, though I don't know how that translates into a parameter count.
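
For a rough conversion, file size divided by bytes per weight gives a ballpark parameter count. This is only a sketch: it assumes the file is almost entirely raw weights, that "480 MB" means decimal megabytes, and it guesses at the storage precision (float32 or float16), none of which I've checked against the actual checkpoint. The function name `estimate_param_count` is just made up for illustration.

```python
def estimate_param_count(size_bytes: int, bytes_per_param: int) -> int:
    """Rough parameter count, assuming the file holds only raw weights."""
    return size_bytes // bytes_per_param


# Assumption: 480 MB taken as decimal megabytes (480 * 10^6 bytes).
size_bytes = 480 * 1000 * 1000

# Assumption: weights are stored as float32 (4 bytes) or float16 (2 bytes).
bytes_per_dtype = {"float32": 4, "float16": 2}

for dtype, nbytes in bytes_per_dtype.items():
    params = estimate_param_count(size_bytes, nbytes)
    print(f"{dtype}: ~{params / 1e6:.0f}M parameters")
```

So that would be roughly 120 million parameters if the weights are float32, or roughly 240 million if they are float16. File headers and any non-weight data would push the real number a little lower.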