Submitted by JClub t3_10fh79i in MachineLearning
JClub OP t1_j4z5ciu wrote
Reply to comment by JoeHenzi in [R] A simple explanation of Reinforcement Learning from Human Feedback (RLHF) by JClub
This package is pretty simple to use! https://github.com/lvwerra/trl
It supports decoder-only models like GPT, and support for encoder-decoder models like T5 is in progress.
JoeHenzi t1_j50pbv9 wrote
I'll take a look, thanks again. I'm building up a dataset that, at the very least, could be interesting to analyze or crunch. I'd love to implement a GA to explore the space, and I have the example code from ChatGPT, but I need to dive deeper. As I may have mentioned in my GH comment, when I try to do predictions around parameters I end up blocking or slowing the API call, so either my code is trash (likely!) or I'm trying to do too much at once.
On my short-term list is using a T5-like model to produce summaries, but I was trying to execute them at bad times and make too many changes at once.
Thanks again for sharing. I'm enjoying playing in the space and love finding people willing to share (unlike OpenAI, which is slowly closing its toys off from the world).