graham_fyffe t1_j4ne8yf wrote

Oh and by the way, using human ratings of the model output is exactly how ChatGPT is trained. Human-in-the-loop reinforcement learning.

4

SoylentRox t1_j4neivp wrote

Correct, but that was done at a small scale by people working for OpenAI. I am saying we look at every novel that has sales data, every story on a site that tracks views or other measures of quality and popularity, etc.

This might give the machine more information about which story elements work and what people like. Maybe enough to construct good stories.
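
A minimal sketch of the idea, assuming view counts can stand in for human ratings. The titles, numbers, function names, and the 0.2 margin are all made up for illustration, and this is not how ChatGPT was actually trained. It only shows how popularity metrics could be turned into "A is preferred over B" pairs, the kind of comparison signal a preference model could learn from.

```python
import math
from itertools import combinations

# Hypothetical records: (title, view_count). Values are invented for illustration.
stories = [
    ("Story A", 120_000),
    ("Story B", 3_500),
    ("Story C", 48_000),
]

def popularity_scores(records):
    """Map raw view counts to [0, 1] on a log scale so huge hits don't swamp the rest."""
    logs = {title: math.log1p(views) for title, views in records}
    lo, hi = min(logs.values()), max(logs.values())
    span = (hi - lo) or 1.0  # avoid dividing by zero when all counts are equal
    return {title: (value - lo) / span for title, value in logs.items()}

def preference_pairs(scores, margin=0.2):
    """Return (preferred, other) pairs whose scores differ by at least `margin`."""
    pairs = []
    for a, b in combinations(scores, 2):
        diff = scores[a] - scores[b]
        if abs(diff) >= margin:
            pairs.append((a, b) if diff > 0 else (b, a))
    return pairs

scores = popularity_scores(stories)
pairs = preference_pairs(scores)
print(pairs)
```

The margin is there because raw popularity is noisy (marketing, timing, genre size), so near-ties are skipped rather than treated as a real preference.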

3