
singularpanda OP t1_j3eohh7 wrote

Yes, it is quite costly. However, it doesn't seem easy to modify it for our research, as it is not open-source.

1

KBM_KBM t1_j3g7swj wrote

https://github.com/lucidrains/PaLM-rlhf-pytorch

This is similar to the ChatGPT architecture; you can play with it.

2

singularpanda OP t1_j3gdv9p wrote

Thanks! Yes, there are many similar things. But ChatGPT seems to have the most amazing performance.

1

KBM_KBM t1_j3gere2 wrote

True, but practically, training a GPT model is not computationally cheap. I think instead of making such generalized language models, we should focus more on subject-specific language models.

1

f_max t1_j3frhxs wrote

Megawatts sounds right for training, but kilowatts for inference. Take a look at Tim Dettmers' work (he's at UW) on int8 to see some of this kind of efficiency work. There's definitely significant work happening in the open.

1
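To illustrate the int8 idea mentioned above: one common approach is absmax quantization, where each float tensor is rescaled so its largest magnitude maps to 127, stored as int8, and dequantized on the fly. This is a minimal NumPy sketch of that idea, not Tim Dettmers' actual LLM.int8() method (which adds outlier handling and per-column scaling); the function names here are made up for illustration.

```python
import numpy as np

def absmax_quantize(x):
    # Rescale so the largest magnitude maps to the int8 limit (127),
    # then round to the nearest integer and store as int8.
    scale = 127.0 / np.max(np.abs(x))
    q = np.round(x * scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float values from the int8 codes.
    return q.astype(np.float32) / scale

# Toy example: 4 bytes per value (float32) shrinks to 1 byte (int8),
# at the cost of a small rounding error.
x = np.array([0.5, -1.2, 3.4, -0.01], dtype=np.float32)
q, s = absmax_quantize(x)
x_hat = dequantize(q, s)
```

The memory saving (4x over float32) is what brings large-model inference down toward the kilowatt scale discussed above; the trade-off is the rounding error visible in `x_hat`.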