wojapa t1_jdl23pj wrote

Did they use RLHF?

3

A1-Delta t1_jdl325g wrote

GPT-J-6B fine-tuned on Alpaca’s instruction dataset.

4

gamerx88 t1_jdmrlhh wrote

No, check their git repo. They used HF Transformers' AutoModelForCausalLM in their training script. It's supervised fine-tuning, not RLHF.
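
To make the distinction concrete, here's a rough sketch (not taken from their repo) of the objective that supervised fine-tuning minimizes: plain next-token cross-entropy on the instruction data, with no reward model involved. The function name `causal_lm_loss` and the toy logits are made up for illustration. HF causal LM models compute the same kind of shifted loss internally when you pass `labels`.

```python
import math

def causal_lm_loss(logits, token_ids):
    """Mean next-token cross-entropy: the logits at position t are scored
    against the token at position t + 1 (hypothetical helper, for illustration)."""
    total = 0.0
    count = 0
    for t in range(len(token_ids) - 1):
        row = logits[t]
        target = token_ids[t + 1]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[target]
        count += 1
    return total / count

# Toy setup: vocabulary of 3 tokens, one sequence of 3 tokens.
token_ids = [0, 2, 1]

# A model that knows nothing assigns equal scores to every token.
uniform_logits = [[0.0, 0.0, 0.0] for _ in token_ids]
uniform_loss = causal_lm_loss(uniform_logits, token_ids)

# A model that favors the correct next token at each step gets a lower loss.
confident_logits = [[0.0, 0.0, 5.0], [0.0, 5.0, 0.0], [0.0, 0.0, 0.0]]
confident_loss = causal_lm_loss(confident_logits, token_ids)

print(uniform_loss, confident_loss)
```

Training just pushes this number down with gradient descent on the labeled examples. RLHF would add a separate reward model and a policy-optimization step on top, which is not what a script like that does.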

3