
f_max t1_j3e2s3m wrote

I work at one of the big tech companies doing research on this. Frankly, LLMs will be the leading edge of the field for the next 2 years imo. Join one of the big techs and you get access to tens of thousands of dollars of compute per week to train LLMs. Or in academia, lots of work still needs to be done to characterize inference-time capabilities, understand bias and failure modes, run smaller-scale experiments with architecture, etc.

14

singularpanda OP t1_j3e6asd wrote

Yes, that's the benefit of being in a big company. However, a lot of NLP researchers like me do not have that many GPU resources (I believe most companies cannot afford this either).

5

f_max t1_j3eagrm wrote

Right. So if you'd rather not shoot to join a big company, there's still work that can be done in academia with, say, a single A100. You might be a bit constrained in pushing the bleeding edge of capability, but there's much to do to characterize LLMs. They're black boxes we don't understand, to a greater degree than perhaps any previous machine learning model.

Edit: there are also open-source weights for GPT-3-type models with similar performance, e.g. Hugging Face's BLOOM or Meta's OPT.
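
To make that concrete, here's a minimal sketch of loading one of those open checkpoints with the Hugging Face `transformers` API. The model IDs are real hub checkpoints; everything else (prompt, generation settings) is just illustrative:

```python
# Minimal sketch: run a small open-source causal LM locally.
# Assumes `transformers` and `torch` are installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"  # or e.g. "facebook/opt-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("One open problem in NLP is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```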

3

singularpanda OP t1_j3elwu4 wrote

It seems that not many recent papers are working on them; I haven't looked at the details. Maybe models like OPT are still too large?

1

f_max t1_j3frqfb wrote

They have a sequence of models ranging from 125M params up to 175B at the largest, so you can work on the smaller variants if you don't have GPUs. There are definitely some papers working on inference efficiency and benchmarking their failure modes, if you look around.
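
As a taste of the inference-efficiency angle, here's a rough sketch of timing generation on one of the smaller OPT variants on a single GPU. The checkpoint name is a real hub ID; the timing methodology is deliberately naive (no warmup, single run):

```python
# Rough single-GPU latency measurement for a smaller OPT variant.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16
).to("cuda")

inputs = tokenizer("Paris is the capital of", return_tensors="pt").to("cuda")

torch.cuda.synchronize()
start = time.perf_counter()
model.generate(**inputs, max_new_tokens=64)
torch.cuda.synchronize()
print(f"64 tokens in {time.perf_counter() - start:.2f}s")
```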

1

Think_Olive_1000 t1_j3tnkld wrote

Dude, that's why you ought to put everything into NLP: find a way to produce better results cheaply on less expensive hardware and you'll be the talk of the town. I think everyone would love to have an unrestricted local version of ChatGPT on their phone. Do the research!

1

currentscurrents t1_j3eo4uc wrote

There's plenty of work to be done in researching language models that train more efficiently or run on smaller machines.

ChatGPT is great, but it needed 600GB of training data and megawatts of power. It must be possible to do better; the average human brain runs on 12W and has seen maybe a million words tops.

2

singularpanda OP t1_j3eohh7 wrote

Yes, it is quite costly. However, it's not easy to modify in our research since it is not open.

1

KBM_KBM t1_j3g7swj wrote

https://github.com/lucidrains/PaLM-rlhf-pytorch

This is similar to the ChatGPT architecture; you can play with it.
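
A minimal sketch of the first step (pretraining the base PaLM), following the interface shown in that repo's README; argument names may have changed since, so check the README for the current usage:

```python
import torch
from palm_rlhf_pytorch import PaLM

# Tiny configuration just to exercise the API; real runs use far larger dims.
palm = PaLM(
    num_tokens=20000,
    dim=512,
    depth=12,
).cuda()

seq = torch.randint(0, 20000, (1, 2048)).cuda()

loss = palm(seq, return_loss=True)
loss.backward()
```

The repo then layers a reward model and an RLHF trainer on top of this base, mirroring the ChatGPT recipe.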

2

singularpanda OP t1_j3gdv9p wrote

Thanks! Yes, there are many similar projects, but ChatGPT seems to have the most impressive performance.

1

KBM_KBM t1_j3gere2 wrote

True, but practically speaking, training a GPT model is not computationally cheap. I think instead of making such generalized language models, we need to focus more on subject-specific language models.
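
For the subject-specific direction, here's a hedged sketch of fine-tuning a small open checkpoint on in-domain text rather than training from scratch. The corpus file name and hyperparameters are placeholders, not recommendations:

```python
# Fine-tune a small causal LM on domain text (sketch, not a recipe).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "facebook/opt-350m"  # small real checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# "domain_corpus.txt" is a placeholder for your in-domain text
# (e.g. medical notes, legal opinions, support tickets).
raw = load_dataset("text", data_files={"train": "domain_corpus.txt"})
train = raw["train"].map(
    lambda b: tokenizer(b["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="opt-domain",
                           per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=train,
    # mlm=False gives standard next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```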

1

f_max t1_j3frhxs wrote

Megawatts sounds right for training, but kilowatts for inference. Take a look at Tim Dettmers' work (he's at UW) on int8 to see some of this kind of efficiency work. There's definitely significant work happening in the open.
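
For a concrete taste, a minimal sketch of int8 loading through the `transformers` + `bitsandbytes` integration, which roughly halves memory vs fp16. The flags are real, but treat the model choice and setup as illustrative; it assumes `bitsandbytes` and `accelerate` are installed:

```python
# Load a mid-sized model with 8-bit weights (LLM.int8()-style quantization).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-6.7b"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # spread layers across available GPUs/CPU
    load_in_8bit=True,   # int8 weight quantization via bitsandbytes
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

inputs = tokenizer("Quantization lets you", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=30)[0]))
```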

1

allaboutthatparklife t1_j3h8d9o wrote

> Frankly LLMs will be the leading edge of the field for the next 2 years imo.

(Curious outsider.) What do you see being the leading edge after that? Or will NLP be more or less solved by then?

1

f_max t1_j3hztd5 wrote

Idk. I have a decent idea of what's being worked on for the next year, but it gets fuzzy after that. Maybe we'll have another architectural breakthrough: AlexNet in 2012, transformers in 2017, something else in 2023 or 2024 maybe.

1