Running llama.cpp locally with Qwen3.6 35B A3B and systemd
A practical guide to installing and configuring llama.cpp to serve the Qwen3.6 35B A3B mixture-of-experts model locally, including tuning runtime parameters and setting up a systemd service so the inference server starts automatically on boot.
local LLM, self-hosting, linux, inference, intermediate