Submitted by michaelthwan_ai t3_121domd in MachineLearning
DigThatData t1_jdmvjyb wrote
Reply to comment by michaelthwan_ai in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
Dolly is important precisely because the foundation model is old. They got ChatGPT-level performance out of it after only three hours of training. Just because the base model is old doesn't mean this isn't recent research. It demonstrates:
- the efficacy of instruct finetuning
- that instruct finetuning doesn't require the world's biggest, most modern model, or even all that much data
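To make the second point concrete, here is a minimal sketch of the data side of instruct finetuning: turning plain (instruction, input, output) records into training prompts. The template wording follows the Alpaca-style format that Dolly's training data used; the helper name `format_record` and the example record are illustrative, not taken from Dolly's actual code.

```python
# Sketch: render instruction-tuning records as training strings.
# Template text follows the Alpaca-style prompt format (assumption:
# this matches the data Dolly was finetuned on).

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{output}"
)

def format_record(record: dict) -> str:
    """Render one instruction-tuning record as a single training string."""
    if record.get("input"):  # some records have extra context, some don't
        return PROMPT_WITH_INPUT.format(**record)
    return PROMPT_NO_INPUT.format(instruction=record["instruction"],
                                  output=record["output"])

# Hypothetical example record in the Alpaca schema:
example = {"instruction": "Name three primary colors.",
           "input": "",
           "output": "Red, yellow, and blue."}
print(format_record(example))
```

The whole trick is that a few tens of thousands of strings like this, fed through ordinary causal-LM finetuning on a modest base model, are enough to get chat-style behavior.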
Dolly isn't research from a year ago; it was first described only a few days ago.
EDIT: OK, I just noticed you have an ERNIE model up there, so this "no old foundation models" rule is inconsistent.