Submitted by robotphilanthropist t3_zh2u3k in MachineLearning
cfoster0 t1_izlys6v wrote
About this bit
> At the moment, TRLX has an API capable of production-ready RLHF at the scales required for LLM deployment (e.g. 33 billion parameters). Future versions of TRLX will allow for language models up to 200B parameters. As such, interfacing with TRLX is optimized for machine learning engineers with experience at this scale.
Has TRLX been used to tune models in production already? Or if not, what did the blog post mean by "capable of production-ready RLHF"? I haven't seen any RLHF-ed models built on open source software yet, much less a 33B parameter one.
EDIT: Also hi @FerretDude
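For reference, the quoted post is describing the trlX training API. A minimal sketch based on the public trlX examples (parameter names follow the README; exact signatures may differ between versions, and the reward function here is just a stand-in):

```python
# Minimal trlX-style RLHF sketch; the reward function is a placeholder --
# in real RLHF it would come from a learned preference/reward model.
import trlx

def reward_fn(samples, **kwargs):
    # Hypothetical reward: prefer shorter completions.
    return [-float(len(s)) for s in samples]

trainer = trlx.train(
    "gpt2",                                   # base model to fine-tune with RL
    reward_fn=reward_fn,                      # scores generated samples
    prompts=["Explain RLHF in one sentence."],
    eval_prompts=["What is a reward model?"],
)
```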
FerretDude t1_izoa26g wrote
It's already being used in production with a number of our partners. We have some chonky models coming out really soon. Expect things well into the tens of billions in the coming months.
cfoster0 t1_izrdeii wrote
Who? Who's even using RLHF in production yet, besides OpenAI (and maybe Cohere)?
FerretDude t1_izs8wj1 wrote

Not allowed to share, but many groups are looking into using RLHF in production.
cfoster0 t1_izuxn52 wrote
Did y'all stop doing work out in the open? That's a shame. End of an era, I guess.
FerretDude t1_izyu3ka wrote
RLHF is a bit tricky because you have to either work with data vendors or with groups that already have access to feedback data. Eventually we'll rely more on crowdsourcing, I think.
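For context, the feedback data in question is typically pairwise preference comparisons (a chosen vs. a rejected completion) used to fit a reward model. A generic sketch of that objective, not specific to trlX, with all names illustrative:

```python
# Generic sketch of fitting a reward model on pairwise human feedback;
# not trlX-specific, and all names here are illustrative.
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen, reward_rejected):
    # Bradley-Terry style objective: the human-preferred completion
    # should receive a higher score than the rejected one.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Example scores a reward model might assign to a small feedback batch.
reward_chosen = torch.tensor([1.2, 0.3, 0.9])
reward_rejected = torch.tensor([0.4, 0.5, -0.1])
loss = preference_loss(reward_chosen, reward_rejected)
```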