Viewing a single comment thread. View all comments

ElvinRath t1_j2o33lj wrote

Sure, there is a tradeoff but I think that for fp16 it isn't that terrible.

For fp8 I just don't know. There is people working with int8 to fit 20B parameters in 3090/4090, but I have no idea of at what price... Just wanted to say that the posibility does exist.

I remember reading about fitting big models in low precision but it was focused in performance/memory usage, but it showed that it was a very useful technique...

​

Anyway I can't find it now, but I found this while looking for it, haha:

https://twitter.com/thukeg/status/1579449491316674560

They claim almost no degratation with int4 & 130B parameters.

​

No idea how this could apply to bigger ones, or even about the validity of the claim, but it does sound well. We would be fitting 40B parameters in a 3090 / 4090...

​

Anyway I think that fp8 might not be out of question at all, but we will see :P

​

I know that you say "chatGPT is like the Wright Brothers. Nobody is going to settle for an AI that can't even see or control a robot. So it's only going to get heavier in weights and more computationally expensive"

And...Sure, no one is going to settle for less. But consumer hardware is very far behind and people is going to try and work with what they have, for now.

And there is some interest for it. You have NovelAI, DungeonAI and KoboldAI, and people plays with them, when frankly, they work quite poorly.

I hope that with the release of good open sourced LLM with RHLF (I'm looking at you, CarperAI and StabilityAI) & this kind of techniques we start to see this tech becoming more comonplace, maybe even used in some indie games, to start pushing for more VRAM on consumer hardware. (Because if there is a need there is a way. Vram is not that expensive anyway given the prices of GPUs nowadays...)

2

SoylentRox t1_j2oeflk wrote

>And...Sure, no one is going to settle for less. But consumer hardware is very far behind and people is going to try and work with what they have, for now.

No they won't. They are just going to rent access to the proper hardware. It's not that expensive.

1