blazejd OP wrote
Reply to comment by Kylaran in [D] Why do we train language models with next word prediction instead of some kind of reinforcement learning-like setup? by blazejd
Glad to hear a non-ML perspective on it! Initializing with language models and then using RL for feedback makes a lot of sense. Could you share any particular papers that I could look into?