Submitted by **Angry_Grandpa_** t3_y92cl1
in **singularity**

If we want to build an AI model based on everything ever said on YouTube, then by my calculations a model with 770 billion parameters trained on 15.7 trillion tokens would be sufficient.

This assumes all of the audio is converted to text and an average speech speed of 100 words per minute. 500 hours of content is uploaded to YouTube every minute. It's probably an overestimate, since I assumed 500 hours per minute over all 10 years and YouTube didn't hit that rate until 2019.

750 words is approximately 1,000 tokens. So it's possible the token number needs to be a little higher (roughly 21 trillion).
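The estimate above can be reproduced in a few lines. This is a sketch using only the post's stated assumptions (500 hours uploaded per real-time minute, 100 spoken words per minute, 10 years of uploads, and the common ~¾-word-per-token rule of thumb):

```python
# Back-of-envelope estimate of tokens in 10 years of YouTube speech.
HOURS_UPLOADED_PER_MINUTE = 500   # YouTube upload rate (post's assumption)
WORDS_PER_MINUTE = 100            # average speech speed (post's assumption)
YEARS = 10

real_minutes = YEARS * 365 * 24 * 60                     # wall-clock minutes
content_hours = real_minutes * HOURS_UPLOADED_PER_MINUTE # hours of video
words = content_hours * 60 * WORDS_PER_MINUTE            # spoken words
tokens = words * 4 / 3                                   # ~750 words per 1,000 tokens

print(f"{words / 1e12:.1f} trillion words")    # -> 15.8 trillion words
print(f"{tokens / 1e12:.1f} trillion tokens")  # -> 21.0 trillion tokens
```

This lands at ~15.8 trillion words, matching the 15.7 trillion figure, and ~21 trillion tokens once the words-to-tokens conversion is applied.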

This is based on the compute-optimal parameter-to-token ratio from the Chinchilla paper.

Source: https://arxiv.org/abs/2203.15556

Based on MosaicML's public cloud pricing, the current cost is $2.5 million for a 1.4-trillion-token model. So the worst-case scenario would be a mere **$35 million** for a YouTube large language model.

Source: https://www.mosaicml.com/blog/gpt-3-quality-for-500k

Is it worth it?

Has Google already done it?

**manOnPavementWaving** t1_it393xz wrote:

my man you can't just scale cost with number of tokens and not number of parameters

way too many mostly false assumptions in these calculations
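The commenter's objection can be made concrete. Training compute scales roughly as 6 × N × D (parameters × tokens), so cost grows with the product, not with tokens alone. A sketch, assuming the $2.5M baseline corresponds to a ~70B-parameter model (the Chinchilla-optimal size for 1.4T tokens; the post doesn't state the baseline model size):

```python
# Cost scaling via the ~6*N*D training-FLOPs approximation.
# Baseline: $2.5M for 1.4T tokens (from the post); 70B params is an
# assumption (Chinchilla-optimal for 1.4T tokens), not from the post.
base_cost = 2.5e6
base_params, base_tokens = 70e9, 1.4e12

target_params, target_tokens = 770e9, 15.7e12  # the post's YouTube model

# Compute (and hence cost) scales with the product N * D,
# so both factors grow, not just the token count.
scale = (target_params * target_tokens) / (base_params * base_tokens)
print(f"~{scale:.0f}x the baseline compute")
print(f"~${base_cost * scale / 1e6:.0f}M")
```

Under that assumption the price tag comes out roughly an order of magnitude above the post's linear-in-tokens $35M estimate, which is the commenter's point.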