
Today I got an LLM running locally.

With acceleration and a four-gigabyte model, the response time is as good as or better than what I'd get from a cloud service that's stealing all my data - around ten tokens/sec on a @frameworkcomputer AMD 13"

github.com/ggerganov/llama.cpp
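For reference, a minimal sketch of what a run like this can look like with llama.cpp's CLI, assuming the project has already been built and a quantized GGUF model downloaded (the model filename and layer count here are hypothetical, not from the post):

```shell
# -m    path to a quantized model file (hypothetical name)
# -p    the prompt to complete
# -n    number of tokens to generate
# -ngl  number of model layers to offload to the GPU for acceleration
./main -m models/model-q4_0.gguf -p "Hello, world" -n 64 -ngl 32
```

A four-gigabyte quantized model is roughly what a 7B-parameter model looks like at 4-bit quantization, which is why it fits comfortably on a laptop.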

Qoto Mastodon
