ThatInternetGuy t1_jckmq5s wrote on March 17, 2023 at 2:58 PM

Reply to comment by CellWithoutCulture in Those who know... by Destiny_Knight

>HF-RLHF

Probably no need, since this model can piggyback on responses generated by GPT-4, so it should inherit the traits GPT-4 got from RLHF, shouldn't it?

CellWithoutCulture t1_jcmsxjq wrote on March 17, 2023 at 11:37 PM

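HF-RLHF is the name of the dataset. As for RLHF itself... what they did to LLaMA is called "knowledge distillation", and iirc it usually isn't quite as good as doing RLHF directly. It's an approximation: the student only learns to imitate the teacher's outputs, it never gets the human preference signal itself.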
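
To make "approximation" concrete, here is a minimal sketch of the classic soft-label distillation loss: the student is penalized by how far its output distribution is from the teacher's. This is only an illustration, not what was actually run on LLaMA. The function names, the toy logits, and the temperature value are all made up for the example. Fine-tuning on a teacher's generated text is roughly the hard-label version of the same idea, where the student only sees the tokens the teacher picked rather than the full distribution.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    Hypothetical helper for illustration only: a real training loop
    would compute this per token over batches, usually with a deep
    learning framework.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy logits over a 3-token vocabulary (invented values).
teacher = [2.0, 1.0, 0.1]
print(distillation_loss([2.0, 1.0, 0.1], teacher))  # 0.0, student matches teacher
print(distillation_loss([0.1, 1.0, 2.0], teacher))  # positive, student disagrees
```

The loss hits zero once the student copies the teacher exactly, so the teacher is the ceiling: whatever RLHF gave the teacher only reaches the student secondhand through its outputs.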