
CollapseKitty t1_je8wa3w wrote

Modern LLMs (large language models), like ChatGPT, use reinforcement learning from human feedback (RLHF): human preference data is used to train a reward model, which is then used to fine-tune the language model.

Basically, humans compare or rank model outputs (which looks more like a cat? which sentence is more polite?), and those judgments are used to train a reward model that captures human preferences. The reward model then automates that judgment and scales it far beyond what human labelers could provide directly, which is what makes it practical to fine-tune massive models like ChatGPT toward, hopefully, something close to what the humans originally intended.
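To make the reward-model half concrete, here's a minimal PyTorch sketch (not OpenAI's actual pipeline): a pairwise preference loss that trains a scorer to rate the human-preferred response higher than the rejected one. The `RewardModel` class, the embedding size, and the random tensors standing in for encoded responses are all made up for illustration.

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Tiny stand-in for a reward model: maps a response embedding to a scalar score."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.score = nn.Linear(embed_dim, 1)

    def forward(self, response_embedding: torch.Tensor) -> torch.Tensor:
        return self.score(response_embedding).squeeze(-1)

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    # Pairwise (Bradley-Terry style) loss: push the score of the human-preferred
    # response above the score of the rejected one.
    return -torch.nn.functional.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy training step on random "embeddings" standing in for encoded responses.
model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

chosen = torch.randn(8, 128)    # responses the human labeler preferred
rejected = torch.randn(8, 128)  # responses the labeler ranked lower

optimizer.zero_grad()
loss = preference_loss(model(chosen), model(rejected))
loss.backward()
optimizer.step()
```

Once a reward model like this is trained, its scalar score stands in for the human labeler, so an RL algorithm (PPO in the usual RLHF setup) can fine-tune the language model against it at scale.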
