
currentscurrents t1_jdaq9xo wrote

Right, but you're still loading the full GPT-4 to do that.

The idea is that domain-specific chatbots might perform better at a given model size. You can see this with Stable Diffusion models: the ones trained on just a few styles produce much higher-quality output than the base model, but only for those styles.

This is basically the idea behind mixture of experts.
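
To make the routing idea concrete, here's a toy sketch. Everything in it is made up for illustration: the two "experts" are stub functions standing in for small specialised models, and the gate is a keyword counter (`KEYWORDS`, `gate`, `route` are all hypothetical names). Real mixture-of-experts layers use a learned gating network that routes per token inside the model, so treat this as the general shape only, not how any actual system does it.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical experts: stubs standing in for small domain-specific models.
def code_expert(prompt):
    return f"[code expert] {prompt}"

def law_expert(prompt):
    return f"[law expert] {prompt}"

EXPERTS = [code_expert, law_expert]

# Assumption: a keyword-overlap gate. A real gate would be a trained network.
KEYWORDS = [
    {"python", "bug", "compile"},
    {"contract", "court", "statute"},
]

def gate(prompt):
    """Score each expert by keyword overlap, then normalise with softmax."""
    words = set(prompt.lower().split())
    scores = [float(len(words & kw)) for kw in KEYWORDS]
    return softmax(scores)

def route(prompt):
    """Top-1 routing: only the highest-weighted expert is run."""
    weights = gate(prompt)
    best = max(range(len(EXPERTS)), key=lambda i: weights[i])
    return EXPERTS[best](prompt), weights

answer, weights = route("fix this python bug")
print(answer, weights)
```

The point is that only one small expert runs per request, so you pay for a fraction of the total parameters instead of loading everything.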
