Viewing a single comment thread. View all comments

harharveryfunny t1_jairuhd wrote

It says they've cut their costs by 90% and are passing that saving on to the user. I'd have to guess that they're making money on this, not just treating it as a loss-leader for other, more expensive models.

The way the API works is that you have to send the entire conversation each time, and the tokens you're billed for include both those you send and the API's response (which you're likely to append to the conversation and send back to them, getting billed for it again and again as the conversation progresses). By the time you've hit the 4K token limit of this API there will have been a bunch of back and forth, so you'll have paid a lot more than 4K * 0.2c/1K for the conversation. It's easy to imagine chat-based APIs becoming very widespread and the billable volume becoming huge. OpenAI runs on Microsoft Azure compute, so Microsoft may see a large spike in usage/profits out of this.
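The compounding effect described above is easy to quantify. A minimal sketch (the per-turn token counts and the $0.002/1K price are illustrative assumptions; actual usage varies):

```python
# Sketch of cumulative billing for a chat-style API where each request
# resends the full conversation history. Assumed price: $0.002 per 1K
# tokens, billed on both prompt and completion tokens.
PRICE_PER_1K = 0.002

def conversation_cost(turn_token_counts):
    """Total billed tokens and cost when every turn resends the history."""
    history = 0        # tokens accumulated in the conversation so far
    total_billed = 0   # tokens actually billed across all requests
    for new_tokens in turn_token_counts:
        total_billed += history + new_tokens  # full context + this turn
        history += new_tokens                 # reply is appended for next turn
    return total_billed, total_billed * PRICE_PER_1K / 1000

# Eight turns of ~500 tokens each just reach a 4K context window...
billed_tokens, cost = conversation_cost([500] * 8)
print(billed_tokens, cost)  # 18000 tokens billed, not 4000
```

So filling a 4K window over eight turns bills roughly 18K tokens in total, which is the "paid a lot more than 4K * 0.2c/1K" point made above.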

It'll be interesting to see how this pricing, and that of competitors, evolves. Also interesting are some of OpenAI's annual price plans outlined elsewhere, such as $800K/yr for their 8K-token-limit "DV" model (DaVinci 4.0?) and $1.5M/yr for the 32K-token-limit "DV" model.

69

luckyj t1_jajaz53 wrote

But that (sending the whole or part of the conversation history) is exactly what we had to do with text-davinci if we wanted to give it some kind of memory. It's the same thing in a different format, at 10% of the price... And having tested it, it's more like ChatGPT ("I'm sorry, I'm a language model"-type replies), which I'm not very fond of. But the price... hard to resist. I've just ported my bot to this new model and will play with it for a few days.

24

currentscurrents t1_jajg818 wrote

> It says they've cut their costs by 90%

Honestly this seems very possible. The original GPT-3 made very inefficient use of its parameters, and since then people have come up with a lot of ways to optimize LLMs.

16

visarga t1_jaj4bqs wrote

> $1.5M/yr

The inference cost is probably 10% of that.

5

xGovernor t1_jaksopw wrote

Oh boy what I got away with. I have been using hundreds of thousands of tokens, augmenting parameters and only ever spent 20 bucks. I feel pretty lucky.

3

Im2bored17 t1_jam6y5y wrote

$20.00 / ($0.002 / 1K tokens) = 10M tokens. If you only used a few hundred K, you got scammed hard lol

8

xGovernor t1_jasx7r9 wrote

You needed the secret API key, included with the Plus edition. Prior to Whispers I don't believe you could obtain a secret key. It also gave early access to new features and provided me turbo on day one. Also, I've used it much more and got turbo to work with my Plus subscription.

Had to find a workaround. Don't feel scammed. Plus I've been having too much fun with it.

1