WarProfessional3278 t1_jaj9nnt wrote

Rough estimate: with one 400 W GPU and electricity at $0.14/kWh, you are looking at ~$0.000016/sec here. That's the price of running the GPU alone, not accounting for server costs etc.
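
For anyone who wants to plug in their own numbers, here is a minimal sketch of that arithmetic. It assumes the electricity price is meant per kWh and that the GPU draws its full 400 W the whole time; the function name is just illustrative. Under those assumptions it works out to roughly $0.000016 per second.

```python
def gpu_electricity_cost_per_second(power_watts: float, price_per_kwh: float) -> float:
    """Electricity cost in dollars per second for a device drawing power_watts.

    Assumes constant power draw and a flat price per kWh (both simplifications).
    """
    kilowatts = power_watts / 1000.0           # W -> kW
    cost_per_hour = kilowatts * price_per_kwh  # kW * $/kWh = $/h
    return cost_per_hour / 3600.0              # $/h -> $/s


cost = gpu_electricity_cost_per_second(400, 0.14)
print(f"${cost:.7f} per second")  # prints "$0.0000156 per second"
```

Swap in your own wattage and local rate; cooling, idle draw, and the rest of the server are not included.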

I'm not sure there are any reliable estimates of FLOPs per token for inference, though I will be happy to be proven wrong :)
