I was wondering about that myself, so I pulled out an envelope to scribble on.
It is thought that an instance of #ChatGPT runs on one Nvidia A100 GPU and puts out on the order of 2 tokens per second. At the new #OpenAI price of $0.002 per 1k tokens, that would generate revenue of about $126 per year. But the GPU draws about 250 W of power, which is about 2,200 kWh over that year; let's say at $0.07/kWh that comes to $154. With these assumptions you could not even pay for the energy, let alone the depreciation of the GPU.
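If you want to redo the scribbles, here is the same arithmetic as a tiny Python sketch; all the inputs are just the rough guesses from above, not measured values:

```python
# Envelope math: one A100 serving ChatGPT (all inputs are rough guesses)
SECONDS_PER_YEAR = 365 * 24 * 3600          # ~31.5 million seconds

tokens_per_year = 2 * SECONDS_PER_YEAR      # 2 tokens/s -> ~63M tokens/year
revenue = tokens_per_year / 1000 * 0.002    # $0.002 per 1k tokens -> ~$126

energy_kwh = 0.250 * 24 * 365               # 250 W around the clock -> ~2,190 kWh
energy_cost = energy_kwh * 0.07             # at $0.07/kWh -> ~$153

print(f"revenue ${revenue:,.0f} vs. energy ${energy_cost:,.0f}")
# revenue $126 vs. energy $153 -- energy alone already exceeds revenue
```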
So let's try running the calculation backwards. If we assume a 30% ROI, an instance cost of $10,000, depreciation over 5 years, and an energy cost of $120 per year, the model would need to generate revenue of about $2,800 per year, and for that the instance would need to produce about 44 tokens per second. That is about a factor of 20 more than we thought it could.
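The same sketch run backwards (again, the inputs are my assumptions, nothing more):

```python
# Reverse calculation: throughput needed for a 30% return on one instance
SECONDS_PER_YEAR = 365 * 24 * 3600

annual_cost = 10_000 / 5 + 120              # 5-year depreciation + energy = $2,120
required_revenue = annual_cost * 1.30       # 30% on top -> ~$2,756, call it $2,800

tokens_per_year = required_revenue / 0.002 * 1000   # ~1.4 billion tokens
print(f"{tokens_per_year / SECONDS_PER_YEAR:.0f} tokens/s needed")
# ~44 tokens/s -- roughly 20x the 2 tokens/s we started with
```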
These are very crude estimates, and don't take significant economies of scale into account. But yes, in the end you'll need to put more models on a single hardware instance, use cheaper hardware, and speed up the generation. All of that.
Or just eat the loss.
🙂
@eliocamp @TedUnderwood@sigmoid.social
I just re-checked the basic parameters of #ChatGPT generation cost that I had noted down in December from a tweet by Tom Goldstein ...
https://twitter.com/tomgoldsteincs/status/1600196981955100694
He noted that you can rent an A100 instance for $3.00 per hour on the Azure cloud, and he extrapolated a speed of 0.35 s per word per card, with 8 cards per server. So that's about what I had estimated. Except that renting that to run it yourself would cost about $0.20 per 1k tokens, and OpenAI is now charging 1% of that.
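The rental estimate, spelled out (the words-to-tokens conversion is my own rough assumption; Goldstein's figures are per word):

```python
# Cost of renting the hardware yourself, using Goldstein's numbers
words_per_hour = 3600 / 0.35                # 0.35 s/word -> ~10,300 words/hour/card
cost_per_1k_words = 3.00 / (words_per_hour / 1000)   # $3/hour -> ~$0.29 per 1k words

# A word is a bit more than one token on average, so per 1k tokens this
# lands around $0.20 -- versus the $0.002 per 1k tokens OpenAI now charges
print(f"~${cost_per_1k_words:.2f} per 1k words")
```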
🙂