llama.cpp can finally run Llama 2 on the GPU (using cuBLAS).

It's still not super fast (due to the limited VRAM), but it's better than pure CPU.

I also noticed a header file in the repo, which means I can make a Java binding (maybe? I didn't find any .dll or .so file during the build).

And as expected, my Razer laptop is now struggling to keep both the CPU and GPU cool at the same time...

@skyblond One thing I hate about laptops is that they tend to slow down over time, since dust is harder to clean out and there is less space for it to accumulate in.

Not sure if that is relevant here though, just my first thought.

@freemo Another thing I hate about laptops is that replacement parts are hard to find. After a certain time, even customer support doesn't have replacements to sell, not to mention that some vendors don't sell parts at all; you have to mail the whole laptop to them just to replace a fan.

Anyway, I think the major issue is that my laptop doesn't have enough space to handle 150 W of heat. The laptop is slim, light, and powerful, but the heat is the bitter pill to swallow.

@skyblond Honestly, if a part breaks in a laptop, I don't even bother trying to fix it most of the time.
