Disastrous_Elk_6375 t1_jdm4h39 wrote

> I have seen many inaccurate claims, e.g. LLaMa-7B with Alpaca being as capable as ChatGPT

I believe you might have misunderstood the claims in Alpaca. They never stated it is as capable as ChatGPT, they found (and you can confirm this yourself) that it accurately replicates the instruction tuning. That is, for most of the areas in the fine-tuning set, a smaller model will output in the same style of davinci. And that's amazing progress over the outputs of the raw models.

20

farmingvillein t1_jdnuvnf wrote

> I believe you might have misunderstood the claims in Alpaca. They never stated it is as capable as ChatGPT, they found (and you can confirm this yourself) that it accurately replicates the instruction tuning. That is, for most of the areas in the fine-tuning set, a smaller model will output in the same style of davinci.

This is a misleading summary of the paper.

They instruction-tune the model, then compare Alpaca against GPT-3.5 and say that Alpaca is about equal on the tasks they compare (which, to be clear, is not equivalent to a test of "broad capability").

Yes, you are right that they don't make a statement that it is categorically more capable than ChatGPT, but they do state that their model is approximately as capable as GPT-3.5 (which is, of course, not a 1:1 match for ChatGPT) on the diverse set of tasks tested.

It is very much not just a paper showing that you can make the model output in the same "style".

4