Posted by cloudking 20 hours ago
Ask HN: Has anyone replaced Claude/GPT with a local model for daily coding?
The secret to actually good agentic outputs, even with small models? llama.cpp supports a little-known sampler called "top-n sigma". Use it: set it to 1, set temperature to literally whatever you want (it could be infinity), and your model will just magically keep working all the way up to your maximum context window. That's because long-context generation is a sampling problem.
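For anyone wondering what that sampler does, here is a rough sketch of the idea as I understand it: keep only the tokens whose logit is within n standard deviations of the top logit, then sample from what's left. This is not llama.cpp's code. The function names are made up for illustration, and the use of population standard deviation is an assumption. It does show why temperature stops mattering much: the kept set is picked before temperature is applied, so even infinite temperature only gives a uniform pick among the survivors.

```python
import math
import random
from statistics import pstdev


def top_n_sigma_filter(logits, n=1.0):
    """Indices of tokens whose logit is within n std devs of the max logit.

    Hypothetical helper; assumes population standard deviation.
    """
    threshold = max(logits) - n * pstdev(logits)
    return [i for i, x in enumerate(logits) if x >= threshold]


def sample_top_n_sigma(logits, n=1.0, temperature=1.0, rng=random):
    """Sample one token index after top-n-sigma filtering (illustrative only)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    kept = top_n_sigma_filter(logits, n)
    if math.isinf(temperature):
        # Infinite temperature flattens the distribution: uniform over survivors.
        return rng.choice(kept)
    # Softmax over the surviving logits, shifted by the max for stability.
    top = max(logits[i] for i in kept)
    weights = [math.exp((logits[i] - top) / temperature) for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


if __name__ == "__main__":
    logits = [10.0, 9.5, 2.0, 1.0, 0.5, 0.0, -1.0, -2.0]
    print(top_n_sigma_filter(logits, n=1.0))  # [0, 1]
    print(sample_top_n_sigma(logits, n=1.0, temperature=math.inf))
```

With n = 1 the long tail of junk tokens never makes it into the candidate set, no matter how high the temperature is.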
That said, I plan to move to local ones once I get my hands on a 256+ GB MacBook.
Local inference is good enough to help me with my daily job, and doesn't turn me into an assistant to the LLM.
If I give it a page of context, can it write a linked list or identify a bad line of CSS?
Is there anywhere online I can chat with a model I could be running at home to see how good it is?
67M Output 51M Input
Total: $0.83.
I honestly don't understand why people don't just use DeepSeek.
Recommended setup: plenty of nutrients, some caffeine and a quiet environment.
Performance - not currently measured in tokens: roughly average.