
starstruckmon t1_j3wvdqt wrote

>I think you may be underestimating the compute cost. It’s about $6M of compute (A100 servers) to train a GPT-3 level model from scratch. So with a billion dollars, that’s about 166 models.

I was actually overestimating the cost to train. If anything, these numbers further demonstrate my point. Even if it cost a whole billion (that's a lot of experimental models), that's still 10 times less than what they're paying.
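The arithmetic behind those figures, as a quick sketch (the $6M-per-run, $1B, and $10B numbers are the ones quoted in this thread, not independently verified):

```python
# Back-of-envelope check of the thread's figures.
cost_per_model = 6_000_000       # ~$6M of A100 compute per from-scratch GPT-3-scale run
budget = 1_000_000_000           # a whole billion dollars of experimentation
deal_size = 10_000_000_000       # the reported $10B investment

models_trainable = budget // cost_per_model
print(models_trainable)          # ~166 experimental models per billion
print(deal_size // budget)       # the deal is 10x even that billion
```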

>Considering experimentation, scaling upgrades, etc., that money will go quickly. Additionally, the cost to host the model to perform inference at scale is also very expensive. So it may be the case that the $10B investment isn’t all cash, but maybe partially paid in Azure compute credits. Considering they are already running on Azure.

I actually expect every last penny to go into the company. They definitely aren't buying anyone's shares (except maybe a fraction of employees' vested shares, and that's not the bulk of it); it's mostly newly created shares. But $10B for ~50% still only gives you a pre-money valuation of ~$10B. That's a lot.
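The valuation math above works out like this (a sketch assuming the round is all newly issued shares, with the thread's $10B-for-~50% terms):

```python
# Pre-money valuation implied by the reported deal terms.
investment = 10_000_000_000      # $10B going into the company
stake = 0.50                     # ~50% ownership after the round

post_money = investment / stake       # company value right after the round
pre_money = post_money - investment   # what the existing company was worth
print(post_money, pre_money)          # $20B post-money, $10B pre-money
```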


Non-jabroni_redditor t1_j3yr9cx wrote

Time and risk. That's why they're spending 10x.

They could spend the next however many years attempting to build a model like GPT, and it's entirely possible it still wouldn't be as good after all of that. The other option is to pay a premium, with money they already have, for a known product.
