daking999 t1_ix5hbok wrote
While there are plenty of good responses here, I want to add that I don't think your idea is dumb. Taking a pre-trained LLM and using it to initialize an RL agent that tries to maximize upvotes when commenting on reddit would be pretty interesting.
Nameless1995 t1_ix6hzd0 wrote
Yeah, that would be an interesting, scalable way to get human feedback. Perhaps someone is already doing it.
blazejd OP t1_ix7jusm wrote
Interesting idea, but it would probably turn into something similar to Yannic Kilcher's 4chan model, which was super toxic, because people give the most upvotes on highly controversial topics.
daking999 t1_ix7livm wrote
I don't think so, that's not what gets upvoted on reddit (for the most part, on the popular subreddits). It would be moderate/left-leaning. It might even learn to be funny.