kromem t1_jdkfj5w wrote

> The model underlying Dolly only has 6 billion parameters, compared to 175 billion in GPT-3, and is two years old, making it particularly surprising that it works so well. This suggests that much of the qualitative gains in state-of-the-art models like ChatGPT may owe to focused corpuses of instruction-following training data, rather than larger or better-tuned base models.

The exciting thing here is the idea that progress in language models is partially contagious backwards to earlier ones: newer models can generate the data used to update older ones, not in pre-training but in fine-tuning (and, based on recent research into in-context learning, I expect this would extend to few-shot prompting as well).
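
Something like this minimal sketch is what I have in mind, using the Hugging Face stack with GPT-J-6B (the model behind Dolly); the dataset path and hyperparameters are placeholders, not anyone's actual recipe:

```python
# Minimal sketch: fine-tune an older base model (GPT-J-6B, the model behind Dolly)
# on instruction/response pairs that could have been generated by a newer model.
# "instructions.jsonl" and the hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Each line of the file: {"instruction": "...", "response": "..."}
data = load_dataset("json", data_files="instructions.jsonl")["train"]

def format_example(ex):
    return {"text": f"### Instruction:\n{ex['instruction']}\n\n### Response:\n{ex['response']}"}

def tokenize(ex):
    return tokenizer(ex["text"], truncation=True, max_length=512)

tokenized = (data.map(format_example)
                 .map(tokenize, remove_columns=["instruction", "response", "text"]))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gptj-instruct", num_train_epochs=1,
                           per_device_train_batch_size=4, learning_rate=1e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The point is that the expensive part stays frozen: only the small, curated instruction set and a short fine-tuning run are needed to transfer the newer model's behavior backwards.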

I'm increasingly wondering if we'll see LLMs develop into rolling releases, particularly in the public sector, possibly with an emphasis on curating the fine-tuning dataset while staying platform-agnostic about the underlying pre-trained model powering it.

In any case, it looks more and more like the AI war between large firms will trickle down into open alternatives whether they'd like it to or not.

38

WarAndGeese t1_jdl5aq6 wrote

That would be pretty nuts and pretty cool. It's still a weird concept, but if it becomes like an operating system that you update, that would be a thing.

9

visarga t1_jdlonpq wrote

One way to speed this up is to make an extension for voluntarily contributing LLM interactions to open source: a user decides when a chat deserves to be donated and pushes a button to share it. I don't think OpenAI can object to users donating their own data.
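
A minimal sketch of what the sharing flow could look like; the endpoint URL and record schema here are hypothetical, just to show the shape of a donated chat:

```python
# Hypothetical sketch of the "donate this chat" flow: the user opts in per
# conversation and the extension posts the transcript to an open dataset endpoint.
# The endpoint URL and record schema are made up for illustration.
import requests

def donate_chat(messages, endpoint="https://example.org/api/donations"):
    """Share an opted-in conversation as an instruction-following record."""
    record = {
        "license": "CC0-1.0",   # donor releases the transcript up front
        "messages": messages,   # [{"role": "user" | "assistant", "content": "..."}, ...]
    }
    resp = requests.post(endpoint, json=record, timeout=10)
    resp.raise_for_status()
    return resp.json()

# Called when the user clicks the "share" button on a finished chat.
donate_chat([
    {"role": "user", "content": "Summarize instruction fine-tuning in one sentence."},
    {"role": "assistant", "content": "Training a base model on instruction/response pairs."},
])
```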

7

SDRealist t1_jdmdwkl wrote

Users could certainly donate their questions, but I believe the TOS for ChatGPT forbid using the generated output to train competing models (at least for commercial purposes).

8