FerretDude
FerretDude t1_izs8wj1 wrote
Reply to comment by cfoster0 in [R] Illustrating Reinforcement Learning from Human Feedback (RLHF) by robotphilanthropist
Not allowed to share, but many groups are looking into using RLHF in production.
FerretDude t1_izoa26g wrote
Reply to comment by cfoster0 in [R] Illustrating Reinforcement Learning from Human Feedback (RLHF) by robotphilanthropist
It's already being used in production with a number of our partners. We have some chonky models coming out really soon. Expect things well into the tens of billions in the coming months.
FerretDude t1_izka011 wrote
Team lead at Carper, happy to answer questions.
FerretDude OP t1_it8sw9v wrote
Reply to comment by Ok-Zombie2406 in [D] Discussion Panel for FOSS Instruct by FerretDude
Don't think I understand the question...
FerretDude OP t1_it6yum2 wrote
Reply to comment by ivalm in [D] Discussion Panel for FOSS Instruct by FerretDude
Ohhh that's a great idea!
FerretDude OP t1_it363tv wrote
Reply to comment by visarga in [D] Discussion Panel for FOSS Instruct by FerretDude
Yeah, I think a more general format for information extraction could potentially be useful.
Submitted by FerretDude t3_y8y8cm in MachineLearning
FerretDude t1_izyu3ka wrote
Reply to comment by cfoster0 in [R] Illustrating Reinforcement Learning from Human Feedback (RLHF) by robotphilanthropist
RLHF is a bit tricky because you have to work with either data vendors or groups that have access to feedback data. Eventually we'll rely more on crowdsourcing, I think.