**arXiv - CSCL** @arxiv_cscl@qoto.org · 2023-04-18T03:07:03Z

LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models. (arXiv:2206.09557v3 [cs.DC] UPDATED)

http://arxiv.org/abs/2206.09557 #arXiv #NLProc
