A GPT-3-class LLM on your desktop?

The FlexGen paper by Ying Sheng et al. shows how to bring the hardware requirements of generative AI down to the scale of a commodity GPU.

github.com/FMInference/FlexGen

Paper on GitHub - authors at Stanford / Berkeley / ETH / Yandex / HSE / Meta / CMU

They run OPT-175B (a GPT-3 equivalent trained by Meta) on a single Nvidia T4 GPU (~$2,300) and achieve a throughput of 1 token/s (roughly 45 words per minute). Not cheap, but on the order of a high-end gaming rig.
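The words-per-minute figure follows from a common rule of thumb (not stated in the post) that one token corresponds to roughly 0.75 English words for GPT-style tokenizers. A quick sanity check:

```python
# Back-of-the-envelope check of the 45 words/minute claim.
# Assumption (not from the post): ~0.75 words per token, a common
# rule of thumb for GPT-style BPE tokenizers.
tokens_per_second = 1.0
words_per_token = 0.75  # assumed conversion factor

words_per_minute = tokens_per_second * 60 * words_per_token
print(words_per_minute)  # 45.0
```

So 1 token/s works out to about 45 words per minute, roughly the pace of a slow typist.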

The implications of personalized LLMs are amazing.

Qoto Mastodon