FerretDude t1_izyu3ka wrote on December 12, 2022 at 9:17 PM

Reply to comment by cfoster0 in [R] Illustrating Reinforcement Learning from Human Feedback (RLHF) by robotphilanthropist

RLHF is a bit tricky because you have to work with either data vendors or groups that have access to feedback data. Eventually we'll rely more on crowdsourcing, I think.

FerretDude t1_izs8wj1 wrote on December 11, 2022 at 1:49 PM

Reply to comment by cfoster0 in [R] Illustrating Reinforcement Learning from Human Feedback (RLHF) by robotphilanthropist

Not allowed to share. Many groups are looking into using RLHF in production, though.

FerretDude t1_izoa26g wrote on December 10, 2022 at 4:36 PM

Reply to comment by cfoster0 in [R] Illustrating Reinforcement Learning from Human Feedback (RLHF) by robotphilanthropist

It's already being used in production with a number of our partners. We have some chonky models coming out really soon. Expect things well into the tens of billions in the coming months.

FerretDude t1_izka011 wrote on December 9, 2022 at 6:50 PM

Reply to [R] Illustrating Reinforcement Learning from Human Feedback (RLHF) by robotphilanthropist

Team lead at Carper, happy to answer questions.

FerretDude OP t1_it8sw9v wrote on October 21, 2022 at 7:38 PM

Reply to comment by Ok-Zombie2406 in [D] Discussion Panel for FOSS Instruct by FerretDude

Don't think I understand the question...

FerretDude OP t1_it6yum2 wrote on October 21, 2022 at 11:49 AM

Reply to comment by ivalm in [D] Discussion Panel for FOSS Instruct by FerretDude

Ohhh, that's a great idea!

FerretDude OP t1_it363tv wrote on October 20, 2022 at 4:26 PM

Reply to comment by visarga in [D] Discussion Panel for FOSS Instruct by FerretDude

Yeah, I think a more general format for information extraction could be useful.