
CKtalon t1_j13dg5b wrote

Just forget about it.

Yes, it's possible to do it on CPU/RAM (Threadripper builds with >256GB RAM plus some assortment of 2x-4x GPUs), but the speed is so slow that it's pointless to work with. DeepSpeed or Hugging Face can spread the model out between GPU and CPU, but even so it will be stupid slow, probably MINUTES per token.
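
For a sense of what that GPU/CPU spreading looks like in code, here's a minimal sketch using Hugging Face transformers with Accelerate-style offloading (the checkpoint name and offload folder are placeholders, not a recommendation):

```python
# Sketch: offloading a large causal LM across GPU, CPU RAM and disk with
# Hugging Face transformers + accelerate. Expect very slow generation.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-30b"  # placeholder; a 175B checkpoint needs far more RAM

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",         # let accelerate place layers on GPU, then CPU, then disk
    offload_folder="offload",  # spill whatever doesn't fit in RAM to disk
    torch_dtype="auto",
)

inputs = tokenizer("Large language models are", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```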

We are at least 5 years away from consumer hardware being able to run 175B+ models on a single machine (say, 4 GPUs in one box).

20B models are within the realm of consumer hardware (3090/4090) with INT8; slow, but still possible.
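
For the 20B-with-INT8 case, the usual route is 8-bit loading via bitsandbytes; roughly like this (sketch only, assuming a recent transformers + bitsandbytes install, with GPT-NeoX-20B as an illustrative checkpoint):

```python
# Sketch: loading a ~20B model in INT8 on a single 24GB card with bitsandbytes.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neox-20b"  # example of a 20B-class checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,  # requires bitsandbytes; weights are quantized to INT8 at load time
)

inputs = tokenizer("Running a 20B model at home is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=30)[0]))
```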

73

caedin8 t1_j147bx3 wrote

Is this just training? What about inference? How does ChatGPT serve millions of people so quickly if it needs such enterprise hardware per request?

22

artsybashev t1_j154fhy wrote

It is just the inference. Training requires more like 100x A100s and a cluster to train on. Just a million to get started.

19

AltruisticNight8314 t1_j1ohh7u wrote

What hardware would be required to i) train or ii) fine-tune weights (i.e. run a few epochs on my own data) for medium-sized transformers (500M-15B parameters)?

I do research in proteomics and have a very specific problem where even just fine-tuning the weights of a pretrained transformer (such as ESM-2) might work well.

Of course, there's always the poor man's alternative of building a supervised model on the embeddings returned by the encoder.
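
That embeddings route is only a few lines in practice. A rough sketch with a small ESM-2 checkpoint and scikit-learn (the checkpoint, mean pooling, and logistic regression are illustrative choices, not the only way to do it):

```python
# Sketch: frozen ESM-2 embeddings + a simple supervised classifier on top.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

model_name = "facebook/esm2_t12_35M_UR50D"  # small ESM-2 checkpoint, illustrative only
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name).eval()

def embed(sequences):
    """Mean-pool the last hidden state over residues, ignoring padding."""
    batch = tokenizer(sequences, return_tensors="pt", padding=True)
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state        # (batch, length, dim)
    mask = batch["attention_mask"].unsqueeze(-1)           # (batch, length, 1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()  # (batch, dim)

# Toy sequences and labels; replace with real sequences and assay outcomes.
seqs = ["MKTAYIAKQR", "MENDELGWTA", "MKKLLPTAAA", "MGSSHHHHHH"]
labels = [1, 0, 1, 0]

clf = LogisticRegression(max_iter=1000).fit(embed(seqs), labels)
print(clf.predict(embed(["MKTAYIAKQK"])))
```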

1

artsybashev t1_j1ph7f3 wrote

One A100 80GB will get you started with models in the 500M-15B range. You can rent one for about $50 per day. See where that takes you in a week.
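
For the fine-tuning half of the question, a bare-bones masked-LM run with the Hugging Face Trainer might look roughly like this (sketch; the checkpoint, data, and hyperparameters are placeholders, and a 15B model would need extra tricks like LoRA or DeepSpeed even on 80GB):

```python
# Sketch: continuing masked-LM training of an ESM-2 checkpoint on custom sequences.
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "facebook/esm2_t33_650M_UR50D"  # placeholder mid-sized checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Replace with your own protein sequences.
ds = Dataset.from_dict({"sequence": ["MKTAYIAKQR", "MENDELGWTA", "MKKLLPTAAA"]})
ds = ds.map(lambda x: tokenizer(x["sequence"], truncation=True, max_length=1024),
            remove_columns=["sequence"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="esm2-finetune",
                           per_device_train_batch_size=8,
                           num_train_epochs=3,
                           fp16=True),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
```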

2

Misaiato t1_j14pagb wrote

MSFT Azure. It has effectively unlimited resources available to it.

9

gBoostedMachinations t1_j155zas wrote

Training is what takes so much computation in almost all cases. Once the model itself is trained, only a tiny fraction of the compute is needed. Most trained ML models that ship today can generate predictions on a Raspberry Pi or a cell phone. LLMs still require more hardware for inference, but you'd be surprised how little they need compared to what's needed for training.
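
As a point of reference, inference with a small trained model is a few lines on a plain CPU (sketch, using an arbitrary off-the-shelf checkpoint):

```python
# Sketch: CPU-only inference with a small, already-trained model.
from transformers import pipeline

clf = pipeline("sentiment-analysis",
               model="distilbert-base-uncased-finetuned-sst-2-english",
               device=-1)  # -1 = run on CPU
print(clf("Inference is cheap compared to training."))
```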

8

calv420 t1_j15ytb1 wrote

Don't see why you're getting downvoted; inference requires significantly less compute than training.

5

gBoostedMachinations t1_j16pzea wrote

If there’s one thing I’ve learned about Reddit, it’s that you can make the most uncontroversial comment of the year and still get downvoted. I mean, I got banned from r/coronavirus for pointing out that people who recover from covid probably have at least a little tiny bit of immunity to re-infection.

After covid, I’ve learned to completely ignore my comment scores when it comes to feedback on Reddit. The only way to know if one of my comments is valued is to read the replies.

7

CKtalon t1_j16qtog wrote

Training will need at minimum about 10x more resources than what I said (inference). And that’s just to fit the model and all its optimiser states with batch size 1.
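
Rough arithmetic behind that, assuming FP16 weights and Adam with FP32 master weights and optimiser states (activations excluded):

```python
# Back-of-envelope memory for a 175B-parameter model (ignores activations).
params = 175e9

inference_fp16 = 2 * params                # 2 bytes per weight
train_adam = (2 + 2 + 4 + 4 + 4) * params  # fp16 weights + fp16 grads
                                           # + fp32 master weights + Adam m and v

print(f"inference: {inference_fp16 / 1e9:.0f} GB")  # ~350 GB
print(f"training:  {train_adam / 1e9:.0f} GB")      # ~2800 GB, ~8x before activations
```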

2

visarga t1_j14bnb7 wrote

GLM-130B runs on 4x 3090s using INT4.
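
GLM-130B ships its own INT4 quantization scheme in its repo; the generic Hugging Face equivalent looks roughly like this (sketch with a newer transformers/bitsandbytes stack and a placeholder checkpoint, not GLM's actual loading path):

```python
# Sketch: 4-bit weight loading with bitsandbytes, spread across available GPUs.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neox-20b",               # placeholder checkpoint
    device_map="auto",                        # shard layers across the visible GPUs
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16  # dequantize to fp16 for the matmuls
    ),
)
```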

6

gBoostedMachinations t1_j155nsu wrote

It’s kind of scary to think how soon the tech will enable randos to make LLMs. Sure, at first expertise will be needed, but as we’ve seen before, it’s only a matter of time before the tools the average Joe needs to train a model are made available.

Jfc shit is getting weird

4