
@wc_ratcliff@ecoevo.social

Here's a recent very rough guess I made:

OpenAI's GPT-4 paper (aka press release) revealed nothing about their model, so you have to make a Fermi estimate.

They did publish info on GPT-3 training: 1,287 MWh to train GPT-3 on 300B tokens. If you assume 10x for GPT-4 and that inference costs about half of training (no backprop), you get about 3e-3 kWh per 1000 tokens. That's probably an upper bound.
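
A minimal back-of-the-envelope sketch of that arithmetic (assumptions are mine, not OpenAI's; I read "10x for GPT-4" as roughly 10x both training energy and training tokens, so the per-token training cost stays about the same as GPT-3's):

```python
# Fermi estimate: GPT-4 inference energy per 1000 tokens.
# Published GPT-3 training figures (from the papers linked below):
GPT3_TRAIN_KWH = 1287e3        # 1287 MWh total training energy
GPT3_TRAIN_TOKENS = 300e9      # 300B training tokens

# Assumptions (mine, not OpenAI's): GPT-4 is ~10x GPT-3 in both training
# energy and training tokens, so the per-token training cost is similar;
# inference (forward pass only, no backprop) costs ~half of that per token.
GPT4_SCALE = 10
INFERENCE_FRACTION = 0.5

gpt4_train_kwh_per_token = (GPT4_SCALE * GPT3_TRAIN_KWH) / (GPT4_SCALE * GPT3_TRAIN_TOKENS)
infer_kwh_per_1k_tokens = INFERENCE_FRACTION * gpt4_train_kwh_per_token * 1000

print(f"~{infer_kwh_per_1k_tokens:.1e} kWh per 1000 tokens")
# Prints ~2.1e-03, i.e. the low-1e-3 range -- same order of magnitude as
# the ~3e-3 kWh per 1000 tokens quoted above as a rough upper bound.
```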

arxiv.org/ftp/arxiv/papers/210
arxiv.org/pdf/2005.14165.pdf
