
DaffyDogModa t1_j2c6m43 wrote

-Only- needs 584 GPUs for 3 months to train haha.


mishap1 t1_j2cd6gg wrote

$6M in A100 GPUs plus all the hardware necessary to run them. Seems totally manageable.


Admirable_Royal_5119 t1_j2cj5co wrote

80k per year if you use aws


shogditontoast t1_j2cps4g wrote

Wow I’m surprised it’s so cheap. Now I regret working to reduce our AWS bill as that 80k would’ve previously gone unnoticed spread over a year.


username4kd t1_j2coq39 wrote

How would a Cerebras CS2 do?


SoylentRox t1_j2f9pw8 wrote

I think the issue is the Cerebras CS-2 has only 40 gigabytes of SRAM.

PaLM is 540 billion parameters - at 4 bytes per parameter (fp32) that's 2.16 terabytes in just weights.

To train it you need more memory than that - I think I read it's a factor of ~3, for gradients and optimizer state. So you need roughly 6 terabytes of memory.

This would be either ~75 A100 80 GB GPUs, or I dunno how you do it with a Cerebras. Presumably you need 150 of them.

Sure it might train the whole model in hours though, cerebras has the advantage of being much faster.

Speed matters, once AI wars get really serious this might be worth every penny.
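The back-of-envelope math above can be written out explicitly. This is a rough sketch, assuming fp32 weights (4 bytes per parameter) and a ~3x training overhead; both are the comment's round-number assumptions, not measured figures:

```python
# Back-of-envelope memory estimate for training a 540B-parameter model.
PARAMS = 540e9          # PaLM parameter count
BYTES_PER_PARAM = 4     # fp32 weights
TRAIN_FACTOR = 3        # rough multiplier for gradients + optimizer state

weights_tb = PARAMS * BYTES_PER_PARAM / 1e12   # ~2.16 TB of raw weights
train_tb = weights_tb * TRAIN_FACTOR           # ~6.5 TB needed to train

a100_gb = 80   # A100 80 GB
cs2_gb = 40    # Cerebras CS-2 on-chip SRAM

# Rounding train_tb down to an even 6 TB gives the ~75 A100 / ~150 CS-2
# figures; the exact numbers land slightly higher.
print(f"weights: {weights_tb:.2f} TB, training: {train_tb:.2f} TB")
print(f"A100s: {train_tb * 1000 / a100_gb:.0f}, CS-2s: {train_tb * 1000 / cs2_gb:.0f}")
```

The overhead factor varies a lot in practice with precision and optimizer choice, so treat the device counts as order-of-magnitude only.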


nickmaran t1_j2e09em wrote

Let me get some pocket change from my Swiss account


quettil t1_j2dn46n wrote

A fraction of the resources wasted by cryptocurrency.


PeakFuckingValue t1_j2ct76l wrote

How big is the storage requirement though? I don't know if I have an accurate perspective on what's beyond terabytes. That's like describing light years to me. Good luck.

It seems like we already unlocked some incredible speed technology recently with quantum computers, many magnitudes beyond the usual pace of progress. Whoever is at the cutting edge of quantum computing and AI research must be trying to combine the two.

Yes it's all crazy to us as consumers, but don't worry. We're in a capitalistic world. Whoever brings it to consumers first gets all the money lmao. They will be so stupid rich as well. I wonder if the people who should work with AI will be the ones who get there first.


rslarson147 t1_j2cz1uh wrote

Wonder if I could use a few at work without anyone noticing


DaffyDogModa t1_j2czs09 wrote

Worth a try!


rslarson147 t1_j2czuh6 wrote

Developing a new hardware stress test


DaffyDogModa t1_j2czz2y wrote

Just need some kind of remote on/off switch that you can turn on when everyone has gone home for the day. But I bet next power bill somebody gonna notice lol


rslarson147 t1_j2d06gs wrote

It’s just one GPU compute cluster, how much power could it consume? 60W?


Coindiggs t1_j2ekopz wrote

One A100 needs about 200-250w each. This needs 584 x 250w = 146,000w, so approximately 146 kW. Average price of power is like $0.30/kWh right now, so running this will cost ya $43.80 per hour, $1,051.20 per day or ~$31,500 per month.
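The cost arithmetic can be checked in a few lines; the ~250 W per GPU and $0.30/kWh figures are the comment's rough assumptions (real draw varies with load and region):

```python
# Back-of-envelope electricity cost for a 584-GPU training cluster.
NUM_GPUS = 584
WATTS_PER_GPU = 250      # rough A100 draw under load
PRICE_PER_KWH = 0.30     # assumed average power price

kw = NUM_GPUS * WATTS_PER_GPU / 1000   # 146 kW of continuous draw
per_hour = kw * PRICE_PER_KWH          # $43.80 per hour
per_day = per_hour * 24                # $1,051.20 per day
per_month = per_day * 30               # ~$31,536 per month

print(f"{kw:.0f} kW -> ${per_hour:.2f}/h, ${per_day:.2f}/day, ${per_month:,.0f}/month")
```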


DaffyDogModa t1_j2d09dx wrote

One GPU if it’s a bad boy can be hundreds of watts I think. Maybe a miner can chime in to confirm.


rslarson147 t1_j2d0cqe wrote

I actually work as a hardware engineer supporting GPU compute clusters and have access to quite a few servers but I’m sure someone in upper management wouldn’t approve of this use


XTJ7 t1_j2fhhjd wrote

This went right over the head of most people. Brilliant comment though.


tomistruth t1_j2eig9t wrote

A medium crypto farm has about 1000 high-end GPUs running for a full year. Server costs will go down, but we will hit a CPU performance plateau soon. Still, compared to tech just 20 years ago, we now have smartwatches more powerful than the desktop PCs of that era. Also, the AI model probably won't run on our phones but will be connected to a giant central server system via the internet.

But what happens once we give those personal AIs access to our computers and data is terrifying.

It could be the end of free speech and democracy, because you could literally become transparent. The AI could predict your habits and needs and show you ads before you even realize you want that.

Scary thought.


go_comatose_for_me t1_j2c6afc wrote

The article made it seem that running the AI at home would be impractical due to hardware needs, but not completely out of reach. The new software does seem to be very, very reasonable for a university or company doing AI research to build and run.


EternalNY1 t1_j2cd4hm wrote

They still estimate $87,000 per year on the low end to operate the 175-billion-parameter model on AWS.

I am assuming that is just the cost to train it though, so it would be a "one time" cost each time you decided to retrain.

Not exactly cheap, but something that can be budgeted for by larger companies.

I asked it specifically how many GPUs it uses, and it replied with:

>For example, the largest version of GPT-3, called "GPT-3 175B," is trained on hundreds of GPUs and requires several dozen GPUs for inference.


aquamarine271 t1_j2ckpo1 wrote

That’s it? Companies pay like at least 100k a year on shitty business intelligence server space that is hardly ever used.


wskyindjar t1_j2cly2m wrote

seriously. Chump change for any company that could benefit from it in any way


aquamarine271 t1_j2cmsny wrote

This guy should put a deck together on the source of this 87k/yr and make it public if he wants every mid sized+ company to be sold on the idea


Tiny_Arugula_5648 t1_j2d910s wrote

It costs much less and trains in a fraction of the time when you use a TPU instead of a GPU on Google Cloud. That's how Google trained the BERT and T5 models.


JigglyWiener t1_j2cdmju wrote

What a solid article. Well written and no hype. Just the facts.


vysken t1_j2db226 wrote

Probably written by AI.


reconrose t1_j2e3u0l wrote

Nah because it'd repeat the same vague, indeterminate bullshit 15 times. I have yet to see any expository text from chatGPT that didn't sound like a 14 yr old trying to hit a word limit. Except in those "examples" where they actually edit the output or go "all I had to do was re-generate the output 20 times giving it small adjustments each time and now I have this mediocre paragraph! Way simpler than learning how to write".


Garland_Key t1_j2e7zag wrote

Most adults have a grade school reading level, so that sounds about right. In my experience ChatGPT creates things that are good enough. My lane is software engineering, so I outsource my writing to AI.


misconfigbackspace t1_j2eoh3t wrote

The part that really made the article worth reading was this:

> Like ChatGPT, PaLM + RLHF is essentially a statistical tool to predict words. When fed an enormous number of examples from training data — e.g., posts from Reddit, news articles and e-books — PaLM + RLHF learns how likely words are to occur based on patterns like the semantic context of surrounding text.

So, even when you ask it to create a completely new fictional mishmash story about Darth Vader landing his Death Star in Aragorn to save Thor from being assimilated by the Borg, it will spew out sensible-sounding sentences, because it knows those references (e.g. Darth Vader, Aragorn, Thor, Borg), knows what words typically come before and after them, and can stitch up a story by linking those common "before" and "after" words together.

It gives an impression of really understanding what it is saying in some sense, possessing mental models of some sort. But it does not. And that is why it will at most be the next replacement of web search - the truly smart assistant.
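The "statistical tool to predict words" idea can be illustrated with a toy bigram model. This is a deliberately tiny sketch with a made-up corpus, nothing like PaLM's actual architecture, but the core mechanism (predict the next word from statistics of the training text) is the same in spirit:

```python
import random
from collections import Counter, defaultdict

# Toy bigram model: count which word follows which in a tiny corpus,
# then sample continuations from those counts.
corpus = "darth vader landed the death star to save thor from the borg".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(start, n=6):
    """Sample up to n continuation words, weighting by observed frequency."""
    word, out = start, [start]
    for _ in range(n):
        options = follows.get(word)
        if not options:
            break  # no observed continuation for this word
        word = random.choices(list(options), weights=list(options.values()))[0]
        out.append(word)
    return " ".join(out)

print(generate("the"))  # output varies with the random seed
```

The model never "understands" Darth Vader or the Borg; it only knows which words tended to follow which in its training text, which is the commenter's point scaled down to a dozen words.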

But it is nowhere close to real intelligence of any kind because it has no model of reality.

Is it great and useful, and will it make money and boost productivity and the economy? Absolutely, and it will change computing services dramatically.

Is it intelligence? Nope. Not even close.


almightySapling t1_j2f7a33 wrote

As long as AI continues to be trained on data from the internet, "average plus epsilon" is the best we can hope for.


Ensec t1_j2f4j5h wrote

It’s pretty good for explaining legal clauses


carazy81 t1_j2d3xjg wrote

$87k is a single person's wage. It's absolutely worth running your own copy and training it with specific material. I jumped on this today, and we'll be running an implementation on Azure with a team of two and as much hardware as reasonably required.

AI chat/assistance has been talked about for decades. ChatGPT is the first implementation I've used that I honestly think has "nailed it".


alpacasb4llamas t1_j2fc1nn wrote

Gotta be able to find the right training material though and enough of it. I don't imagine many people have the resources or the ability to get that much raw data to get the model accurately trained.


carazy81 t1_j2fqs33 wrote

Yes, you’re right, but it depends on what you want it for. We have some specific applications, one of which is compliance checking. I suspect it will need a “base” of information to generate natural language and then a branch of data specific to the intended purpose. Honestly, I’m not sure, but either way, it’s worth investigating.


onyxengine t1_j2ciczh wrote

It's expensive, but it is feasible for an organization to raise the capital to deploy the resources. It's better than AI of this scale being completely locked down as proprietary code.


extopico t1_j2bx06f wrote

This is a good article. Thank you for sharing it.


Vegetallica t1_j2djtwc wrote

Due to privacy reasons I haven't been able to play around with ChatGPT (OpenAI tracks IP addresses and requires a phone number to make an account). I would love to play around with one of these chat AIs when they can get the privacy thing sorted out.


serverpimp t1_j2d6ufu wrote

Can I borrow your AWS account?


popetorak t1_j2erzfp wrote

There's now an open source alternative to ChatGPT, but good luck running it

that's normal for open source


unua_nomo t1_j2e5rp7 wrote

I mean, honestly wouldn't be that hard to even crowd source training an open source model right?


misconfigbackspace t1_j2en6pf wrote


unua_nomo t1_j2enydh wrote

Crowdsource the funding, not the content the model is trained on


misconfigbackspace t1_j2erpp0 wrote

Funding it one time is fairly easy. Getting a copy of that data is a little harder. That data will also go stale in real time as the world moves forward, so that's the other big thing to keep in mind. I wonder what legal challenges will come up if the model copies stuff from litigious IP owners like Disney, the top music artists, Hollywood and the like.


unua_nomo t1_j2eyhnh wrote

I mean there are already open source datasets available, such as the Pile.

I can't see any argument for why a model derived from open source data couldn't likewise be open source. And if you could argue that an ML model produced IP-infringing content, that would be the responsibility of the individual producing and subsequently distributing that content.

As for data becoming stale, that wouldn't necessarily be an issue for plenty of applications, and even then there's no reason you couldn't just crowd fund 80k a year to train a newly updated model with newer content folded in.


syfari t1_j2fekeo wrote

Challenges are already popping up from artists over diffusion models. A lot of this has already been settled though as courts have determined model training to fall under fair use.


the_bear_paw t1_j2crbqu wrote

Genuine question as I'm confused: I tried chatgpt the other day and it is free to use and just required a log in, and I could use it on my phone... What benefit is there to an open source version when the original version is free?


kraybaybay t1_j2csych wrote

Original won't be free for long, and there are many reasons to train the model on a different set of data.


ImposterSyndrome53 t1_j2cx57m wrote

I haven’t followed this incredibly closely, so I might be wrong, but ChatGPT uses their GPT-3 model, and there is only free, non-commercial access to that model. So no other companies are able to leverage it in a service. This would enable others to use it commercially and profit from it.

Edit: I haven’t actually looked, but open source doesn’t mean “able to be used commercially with no limitations” either. There might be stipulations even on this new derivative one.


11fingerfreak t1_j2ee9ef wrote

You can feed this one your own training materials. That means you can teach it to “speak” the way you want it to. Hypothetically, you could feed it every text you’ve ever composed and it would eventually generate text that sounds like you instead of a combination of every random person from the internet or whatever authors they “borrowed” content from.


the_bear_paw t1_j2ehjwn wrote

Cool thanks for clarifying, this makes more sense now. I was thinking about this only from the consumers perspective and generally, open source just means free to filthy casuals like me, so I didn't understand why anyone cared since chatgpt is currently free.

Also, after posting I thought about it and asked chatgpt hypothetically how would a German civilian with 100,000 net worth effectively go about assassinating Vladimir Putin without getting caught and it gave me a lame answer about not being used to assist violent political acts, which I found kinda dumb. So I assume feeding it different information and setting different parameters on what the thing can reply to would be helpful.


11fingerfreak t1_j2em9ck wrote

There are some drawbacks that make it challenging for us plebs to use it, of course. The amount of hardware needed for training isn’t something we’re likely to have at hand. Renting it from AWS appears to be around $87k/year. Though I guess we could just feed it text and wait the couple of years for it to be trained 😬

Still gonna try it. I’m used to waiting for R to finish its work so…

This is a big benefit to any organization that has a reasonable budget for using Azure or AWS, though.

EDIT: we can probably still make use of it despite the hardware demands. It just means it will take us longer to train as non-corporate entities.


peolorat t1_j2dcmta wrote

More specialization would be a benefit.


Qss t1_j2e87o2 wrote

OpenAI likely won’t leave it free forever, and ChatGPT is severely restricted in its application, very much a walled garden.

There are other open source projects, one that comes to mind is Stability AI, that are rumored to be developing a model that will run natively on your phone hardware, no web access required.

Open source will also allow people to train these models on more specific data sets, maybe focused around coding or essay writing or social media posting in particular, instead of a one size fits all solution.

Open source will also mean the tech can evolve at a breakneck pace, as the Stable Diffusion text-to-image generator has shown - giving a wide-open toolset to the general public results in explosive growth compared to giving them only a front-end UI.

It also democratizes the information. AI will monumentally shift our social and economic landscape, and leaving that power in the hands of an “elite few” will only serve to widen power gulfs and classist demarcations.