yahma t1_j2ss1ox wrote

So with pruning and 8-bit quantization, are we able to run BLOOM-176B on a single GPU yet?

4

artsybashev t1_j2suada wrote

An A100 can run about 75B parameters in 8-bit. With pruning that is doable, but it won't have quite the same perplexity.
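
A rough back-of-the-envelope sketch of that arithmetic, in case it helps. It counts only the weights (one byte per parameter at 8-bit) and ignores activations, the KV cache, and framework overhead, so real headroom is smaller. The 75B budget is the figure from this comment; the helper names are made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_param: int = 8) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def min_prune_fraction(n_params: float, budget_params: float) -> float:
    """Fraction of weights that must be removed to fit the parameter budget."""
    return max(0.0, 1.0 - budget_params / n_params)


BLOOM_PARAMS = 176e9   # BLOOM-176B
BUDGET_PARAMS = 75e9   # "about 75B parameters in 8bit" (figure from the comment)

full_gb = weight_memory_gb(BLOOM_PARAMS)      # weights of the unpruned model
budget_gb = weight_memory_gb(BUDGET_PARAMS)   # weights that fit the stated budget
prune = min_prune_fraction(BLOOM_PARAMS, BUDGET_PARAMS)

print(f"BLOOM-176B weights at 8-bit: {full_gb:.0f} GB")
print(f"Budget at 8-bit: {budget_gb:.0f} GB")
print(f"Pruning needed to fit: {prune:.1%}")
```

Under those assumptions, well over half of the weights would have to be pruned, which is why the perplexity would not stay the same.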

6

currentscurrents t1_j2trd40 wrote

If only it could run on a card that doesn't cost as much as a car.

I wonder if we will eventually hit a wall where more compute is required for further improvement, and we can only wait for GPU manufacturers. It would be similar to how researchers could never have created these language models in the 80s, no matter how clever their algorithms. They just didn't have enough compute power, memory, or the internet to use as a dataset.

5

artsybashev t1_j2v9lx2 wrote

If you believe in the singularity, at some point we reach an infinite loop where "AI" creates better methods to run the calculations that it uses to build better "AI". In a way that is already happening, but once that loop gets faster and more autonomous it can find a balance where the development is "optimally" fast.

1