
currentscurrents t1_j2cm36p wrote

TL;DR: they want to take another large language model (Google's PaLM) and do Reinforcement Learning from Human Feedback (RLHF) on it, like OpenAI did for ChatGPT.

At this point they haven't actually done it yet, since they need both compute power and human volunteers to do the training:

>Human volunteers will be employed to rank those responses from best to worst, using the rankings to create a reward model that takes the original model’s responses and sorts them in order of preference, filtering for the top answers to a given prompt.

>However, the process of aligning this model with what users want to accomplish with ChatGPT is both costly and time-consuming, as PaLM has a massive 540 billion parameters. Note that the cost of developing a text-generating model with only 1.5 billion parameters can reach up to $1.6 million.

Since it has 540B parameters, you'd still need a GPU cluster just to run it.
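For anyone curious what the "rank responses, train a reward model" step actually looks like: the standard objective (the one OpenAI describes for InstructGPT/ChatGPT) is a pairwise loss that pushes the reward model to score higher-ranked responses above lower-ranked ones. Here's a toy sketch in plain Python; the function name and the hard-coded scores are made up for illustration, and a real implementation would compute this over model outputs in a framework like PyTorch:

```python
import math

def pairwise_ranking_loss(scores):
    """Toy RLHF reward-model objective.

    `scores` are hypothetical reward-model outputs for K responses to one
    prompt, ordered best-to-worst by the human volunteers. For every pair
    where response i was ranked above response j, we add the pairwise loss
    -log(sigmoid(r_i - r_j)), which is minimized when the model scores the
    preferred response higher.
    """
    loss, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            # preferred response i should out-score response j
            loss += -math.log(1.0 / (1.0 + math.exp(-(scores[i] - scores[j]))))
            pairs += 1
    return loss / pairs

# A reward model that agrees with the human ranking incurs a lower loss
# than one that inverts it:
good = pairwise_ranking_loss([3.0, 2.0, 1.0])  # matches the ranking
bad = pairwise_ranking_loss([1.0, 2.0, 3.0])   # inverts the ranking
print(good < bad)  # True
```

The trained reward model then scores new responses so a PPO-style fine-tuning loop can optimize the base model against it; that second stage is where most of the compute cost lands.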

81

Ok_Reference_7489 t1_j2e73fe wrote

>At this point they haven't actually done it yet

There is no "they" here. This is just a blog post by some random crypto guy who clearly doesn't know what he's talking about.

34

currentscurrents t1_j2ef37r wrote

Right, he's not the developer - it's just an article about the project.

8

Ok_Reference_7489 t1_j2eg79x wrote

There is no project.

−1

FruityWelsh t1_j2covdi wrote

It'll be interesting to see if something like petal.ml can help with this. The human feedback and GPU processing parts, that is.

17

lucidrage t1_j2e7pgv wrote

Just blockchain it and use the reward tokens for API consumption

−10