cfoster0 t1_izlys6v wrote

About this bit

> At the moment, TRLX has an API capable of production-ready RLHF at the scales required for LLM deployment (e.g. 33 billion parameters). Future versions of TRLX will allow for language models up to 200B parameters. As such, interfacing with TRLX is optimized for machine learning engineers with experience at this scale.

Has TRLX already been used to tune models in production? If not, what does the blog post mean by "capable of production-ready RLHF"? I haven't seen any RLHF'd models built on open-source software yet, much less a 33B-parameter one.
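
For reference, my mental model of "interfacing with TRLX" is roughly the public `trlx.train` entrypoint from the repo, something like the sketch below. This is just an assumption about what the post means by the API; the reward function here is a toy stand-in, not an actual learned preference model.

```python
import trlx

def reward_fn(samples, **kwargs):
    # Toy placeholder reward: prefer longer completions.
    # In real RLHF this would be a reward model trained on human preference data.
    return [float(len(sample)) for sample in samples]

# Fine-tune a base model against the reward function with trlx's default RL setup.
trainer = trlx.train("gpt2", reward_fn=reward_fn)
```

That works fine at small scale, but I'd like to hear what changes (sharding, trainer config, etc.) are assumed once you push toward 33B+.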

EDIT: Also hi @FerretDude
