Posted by cloudking 17 hours ago
Ask HN: Has anyone replaced Claude/GPT with a local model for daily coding?
It's faster than I can read, but it feels slow as hell. I think 40-50 tks is probably much more comfortable and I hope I can reach that when trying this on llamacpp soon enough.
[0] - https://pastes.io/9gaARxE8
[1] - https://jsfiddle.net/pou4nbh9/1/
Model: https://huggingface.co/google/gemma-4-26B-A4B-it-qat-q4_0-gg...
I think it's so good that I now scour the local marketplaces for good buys on 24GB cards that don't seem run through by miners and the likes, to build an even bigger rig for parallel execution.
Power usage is also totally not an issue, AI workload is very different from gaming.
tldr llama.cpp-vulkan with opencode on total 48GB VRAM AMD cards on arch btw.
Sure, you can get the local models to generate plausibly-looking code for simple cases. But compared to how I solve complex design problems in a large codebase with Claude Code and Opus/Fable, this isn't worth my time.
I'm still optimizing it (with claude, to be clear), but my testing is very encouraging. I worry a lot about companies (and the government) controlling access to machine intelligence, so local is the way to go.
The only reason it’s economical is because it’s massively discounted if you’re not paying API rates.