Posted by mpweiher 1 day ago

A guide to local coding models (www.aiforswes.com)
589 points | 348 comments
elestor 1 day ago|
yeah, my 4GB of VRAM isn't gonna cut it
bearjaws 1 day ago||
I am sorry, but anyone who has actually tried this knows it is horrifically slow: significantly slower than just typing it yourself, for any model worth its salt.

That 128GB of RAM is nice, but the time to first token is painfully long on any context over 32k, and the results are not even close to Codex or Sonnet.
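
For a rough sense of why time to first token blows up with context, here is a minimal back-of-envelope sketch in Python. The prefill rate is an illustrative assumption, not a measured benchmark; real numbers depend on the model, quantization, and hardware.

    # Time to first token is dominated by prefill: the whole prompt must be
    # processed before the first output token appears. Numbers below are
    # assumptions for illustration only.
    def ttft_seconds(prompt_tokens: int, prefill_tok_per_s: float) -> float:
        return prompt_tokens / prefill_tok_per_s

    # e.g. a 32k-token prompt at an assumed 200 tok/s prefill rate:
    print(ttft_seconds(32_000, 200.0))  # -> 160.0 seconds, roughly 2.7 minutes

Even doubling the assumed prefill rate still leaves you waiting over a minute before the first character, which is the gap the parent is describing.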

dhruv3006 1 day ago||
r/locallama has very good discussion for this!
bjt12345 1 day ago|
/r/localllama is the spelling; I'm forever making this same mistake.
dhruv3006 12 hours ago||
ahaha
j45 1 day ago||
The work and interest in local coding models reminds me of the early 3D printer community: whatever is possible may take more-than-average tinkering until someone makes it a lot easier.
m3kw9 1 day ago||
Nobody doing serious coding will use local models when frontier models are that much better. And no, they are not half a gen behind frontier; more like 2 gens.
artursapek 1 day ago||
Imagine buying hardware that will be obsolete in 2 years instead of paying Anthropic $200/month for $1000+ worth of tokens
selcuka 1 day ago|
> Imagine buying hardware that will be obsolete in 2 years

Unless the PC you buy costs more than $4,800 (24 x $200), it is still a good deal. For reference, a MacBook M4 Max with 128GB of unified RAM is $4,699. You need a computer for development anyway, so the extra you pay for inference is more like $2-3K.

Besides, it will still run the same model(s) at the same speed after that period, or maybe even faster with future optimisations in inference.
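
A minimal sketch of that break-even arithmetic in Python. The $2,000 baseline for "a computer you'd buy anyway" is my assumption, consistent with the $2-3K figure above; the other prices are the ones quoted in this thread.

    # Prices are the figures quoted in this thread, not authoritative.
    SUBSCRIPTION_PER_MONTH = 200   # Anthropic plan, $/month
    HARDWARE_PRICE = 4_699         # MacBook M4 Max, 128GB unified RAM
    BASELINE_DEV_MACHINE = 2_000   # assumed cost of the machine you'd buy anyway

    extra_for_inference = HARDWARE_PRICE - BASELINE_DEV_MACHINE
    break_even_months = extra_for_inference / SUBSCRIPTION_PER_MONTH
    print(extra_for_inference, round(break_even_months, 1))  # -> 2699 13.5

On those assumptions the extra spend pays for itself in about 13.5 months against the $200/month plan, well inside the 2-year obsolescence window claimed above.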

hu3 1 day ago||
The value depreciation of the hardware alone is going to be significant, probably enough to pay for three ~$20 subscriptions to OpenAI, Anthropic and Gemini.

Also, if you use the same Mac for work, you can't reserve all 128GB for LLMs.

Not to mention a Mac will never run SOTA models like Opus 4.5 or Gemini 3.0, which the subscriptions give you.

So unless you're ready to sacrifice quality and speed for privacy, it looks like a suboptimal arrangement to me.

dchftcs 1 day ago|||
I suspect depreciation will be a bit slower for a while, because there is a supply crunch.
artursapek 1 day ago|||
Yeah, and that doesn't even mention the fact that you can't run Opus on your own hardware. Total waste of cash.
hmokiguess 1 day ago|
This seems really interesting. Reminds me of IPFS, but for AI.