AI Coding at Home Without Going Broke

Posted by sbochins 1 hour ago

AI Coding at Home Without Going Broke (stephen.bochinski.dev)

88 points | 85 comments

tunesmith 1 hour ago|

I feel like I must have plateaued and don't know what to do next to level up. I'm currently on the $100/month codex plan and it seems fine using 5.5-xhigh all the time. I think of what to do next, have a chat session to determine exactly what to ask for up to the point of being ready to implement, and then codex churns on a commit-sized task whereupon I briefly check it on my local dev server. If necessary I ask for a change. Then I ask it to commit and recommend the next step based off the spec. Oftentimes I have to "approve" an out-of-sandbox request anyway.

I haven't found anything that requires running all night. I could tell it to one-shot a big plan but given how often I realize I want an intermediary thing to be slightly different it seems like a waste of effort.

I'm guessing the next thing I should probably look into is some sort of machine vm I can tunnel my codex-gui requests to so I don't have to deal with the sandbox approvals (I don't want to give it "dangerous" access to my entire mac).

I don't understand what people are doing with their side projects that is leading them to churn through tokens so quickly, to the point of requiring two $200/month subscriptions and a bunch of token charges besides.

tchock23 11 minutes ago||

Same boat here. I’m able to get a lot done on CC at $100/mo and feel like I’m not being creative or productive enough somehow when I hear of people blowing past that in a day.

PeterStuer 30 minutes ago|||

I'm on $100 Claude. I have a setup with bespoke local services that mitigates some high token consumption scenarios with local LAN services. I screen mcp's and hooks for cache poisoning. I run 100% on Opus with max effort, and never came close to hitting 5 hour or weekly limits before the Fable release. I am in Claude Code at least 20hrs a week.

I see people just completely wasting tokens with ridiculous setups, 100% hitting cache misses as well as dumping huge files into context all the time.

Just learn how these things work, or pay the price I guess.

dnautics 58 minutes ago|||

I have been on $100/mo claude and it has been churning out quite good software for months now. i estimate it has done what would have taken me three-ish years on my own, assuming i didn't burn out from failure (i would have). i only hit limits when i double fisted claude with my main project and my side project. just the other day i noticed i had been stuck on 4.5 because i failed to update the npm package.

sheremetyev 45 minutes ago|||

> I don't want to give it "dangerous" access to my entire mac

I'm running Claude/Codex inside native macOS sandbox, configured with a simple script - https://github.com/sheremetyev/sandfence

always in "bypass permissions" mode - it works until the task is solved, sometimes 1 hour or more (which includes running tests etc)

contingencies 31 minutes ago||

recommend converting to https://github.com/apple/container

sheremetyev 11 minutes ago||

Linux VM doesn't run native macOS toolchain and requires copying files back and forth

pshirshov 1 minute ago||

> and the hardware you buy today may look like a bad bet in a year.

3090s and 7900s are going well so far.

Next year an Arc Pro B70 won't produce fewer tokens for you than it does today.

bachmeier 18 minutes ago||

> The upfront cost is steep and the models you can actually run at home are weaker than what the frontier labs ship, so this only pays off if you can keep the rig busy with long running tasks where a slower, cheaper model grinds away overnight. Most people can’t keep a home machine that loaded, and the hardware you buy today may look like a bad bet in a year.

Oh, so this is not a post about AI coding at home. It's about vibe coding at home.

There's a lot I disagree with in this post, but I'm posting this from a home computer with 64 GB of RAM and no GPU. I do lots of AI coding while spending very little money. I run Gemma 4 26b (mixture of experts) and Qwen 3 coder with Ollama. I use Github Copilot code completions. I use the Gemini and Mistral API free tiers. I have a Gemini paid API account. It's now prepaid, so you don't have to worry about an accidental $1000 bill. You can do a lot of things with Gemini Flash Lite 3.1.

None of this is burning through tokens to create an expensive blob of spaghetti code, but it does qualify as AI coding.

isatty 1 hour ago||

> The first is to self host. You buy the machine, run open source models locally, and pay nothing per token after that.

Power is not free.

What I’ve found is that you’re basically paying a premium for privacy, and that’s worth it for me.

dofm 1 hour ago||

Luckily I needed a new laptop and I bought an M1 Max secondhand from a friend quite cheaply because it was fast enough to recompile something else I am interested in.

So for me, there is no additional hardware cost; it was acquired in replacement.

I run the AI models at home on this kit because I want to; I'll use openrouter if I need to.

I accept the economics of this article are right. But I feel so incredibly sad about this outcome that we're now just to be people caretaking machines that do the job we loved that actually I am not sure that exercising this nuance is going to matter in the long term.

It turns out it is a mistake I have made in my life — now really unfixable because I am a bit too old — to believe that I will always find enough fulfilment in my work to offset the absence of personal fulfilment elsewhere; I have always enjoyed being able to help people directly by doing a thing I love and I am good at, and that has kept away the sadness of finding it difficult to build a conventional family life to enjoy.

I assumed I would always find some new way to find that enjoyment, but even the slim enjoyment from being able to explore this stuff on my own kit in my own terms will not be enough if the pendulum does not swing back towards human effort.

It is a dismal world we have made for ourselves. Lately I have found myself dreading growing too much older in it.

throwaway219450 48 minutes ago|||

Also, I would anticipate at least a 5 year lifespan for a current generation card. The 3090 is still respectable simply because it has 24GB of RAM which, for years, has been the limiting factor for ML at home. If you got a 6000, sure it’s going to cost 7-8k, but the resale value is likely to be very good. Even the 3090 is 50%+ of RRP still. And if you’re not doing LLMs, it’s an interesting value proposition for “classic” CNN vision model training. You can fit enormous batch sizes on 96 GB. The biggest reason to upgrade is that perf/watt has about doubled (e.g. the 4000 Pro Blackwell draws about half the power of the 3090 for similar performance).

People tend to assume the capex is thrown away but as we’ve seen with RAM, don’t be so sure you won’t be able to flip it if you need to.

warumdarum 1 hour ago|||

Actually, if you have solar, it kind of is... so private AI compute gets de facto cheaper during the day?

reactordev 1 hour ago||

If you have solar, it is not, because you have battery and equipment degradation from cycle charging, c’mon man…

I would agree with you if you said it was vastly cheaper overall (with the initial equipment investment amortized over time) compared to The Power Company.

In many states, even if you are generating electricity and selling it back to the power company, they still gonna charge you normal rates of usage because greed.

If you go off grid, you have bigger things to worry about than how to power your AI cluster. It’s manageable enough if you have land but that’s in scarce supply.

dnautics 54 minutes ago|||

> if you have solar, it is not, because you have battery and equipment degradation from cycle charging, c’mon man…

no, the rate of that is pretty independent of use. unless you live in a place where selling energy back rules are designed to screw the solar owner (California)

reactordev 51 minutes ago||

California, Arizona, Texas, most of the southern states…

dnautics 56 minutes ago|||

> Power is not free.

it's ~free if you have home solar.

enraged_camel 1 hour ago|||

>> Power is not free.

There's actually an interesting thought experiment here: if it takes you a full day to build something that AI would otherwise build in a day, do you end up using more power, or less? What is the break-even point, purely from a power consumption perspective?

dofm 1 hour ago|||

If an identical task takes a day on both sides, then the human route uses less energy, surely.

Brains are thousands or maybe even millions of times more fuel-efficient than computers and you are alive for the whole day either way, right? You probably eat about the same even.

The reason executives think AI is more efficient is that it is more space efficient than a human and doesn't demand to be paid or work only a set number of hours. Everything with computing is more efficient if you resent having to give money to other humans. If they could just not have you be alive when they don't need you, it'd possibly be different.

Even though I think at a typical British freelance rate and a truly unsubsidised token price, the AI is possibly more expensive than me. And as a freelancer, from their perspective I really am not alive until they need me. (This is what it often feels like)

The reality is the human and the AI aren't used to build the same things anyway so it's a comparison you can't really make.

evrydayhustling 42 minutes ago||

Brains are efficient, but civilized humans aren't. In the USA, adults consume at a rate of about 10kW -- only 1-2% of that being the human's metabolism, the rest being HVAC, electrical devices, etc.

For comparison, a modern frontier model like Gemini 3.5 Pro consumes about 15kW -- so only about 1.5x the fully loaded human. In an 8h workday, that model would crank through ~80M tokens (~$5k at API prices). That's ~4 major refactors of a 10k LOC codebase, so probably not a very realistic comparison to a single human dev.

I think a more useful comparison, based on my experience, is that an engineer with AI support can get one 8h day's worth of unassisted work done in 1h. So, the 25 kWh consumed during collaboration (conservatively assuming I keep the GPU hot for the whole hour) frees up the remaining 70 kWh I'll draw down for the day to be spent in some other way.
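For anyone who wants to check the arithmetic in this comment, here is a minimal sketch. It takes the commenter's figures as given (10 kW per fully loaded adult, 15 kW for the model, an 8h day, 1h of assisted work); none of them are measurements, and the function and variable names are made up for illustration.

```python
# Figures quoted in the comment above. They are the commenter's estimates,
# not measured values.
HUMAN_KW = 10.0       # claimed fully loaded consumption rate of a US adult
MODEL_KW = 15.0       # claimed draw of a frontier model while serving
WORKDAY_HOURS = 8.0   # length of an unassisted working day


def energy_budget(assisted_hours: float = 1.0) -> dict:
    """Compare an unassisted 8h day with a shorter AI-assisted session.

    Hypothetical helper: assumes the model runs flat out for the whole
    assisted session and that the human keeps consuming at HUMAN_KW
    for the rest of the day regardless of what they do with it.
    """
    unassisted_kwh = HUMAN_KW * WORKDAY_HOURS
    collaboration_kwh = (HUMAN_KW + MODEL_KW) * assisted_hours
    freed_kwh = HUMAN_KW * (WORKDAY_HOURS - assisted_hours)
    return {
        "unassisted_kwh": unassisted_kwh,
        "collaboration_kwh": collaboration_kwh,
        "freed_kwh": freed_kwh,
        "model_to_human_ratio": MODEL_KW / HUMAN_KW,
    }


if __name__ == "__main__":
    for name, value in energy_budget().items():
        print(f"{name}: {value:g}")
```

With the default of one assisted hour this reproduces the 25 kWh and 70 kWh figures, and the 1.5x ratio between model and human. Changing `assisted_hours` shows how quickly the comparison shifts if the speedup is smaller than 8x.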

antasvara 44 minutes ago||||

Studies on grandmaster chess players indicate that at most you burn 10% more calories when engaged in deep thought than when you're at rest. So the energy "attributable" to an hour of knowledge work is like 10 calories (average sedentary calorie burn is like 80-100 per hour; add a max of 10% for the thinking gets you 8-10 calories). A pound of potatoes is like a buck and is about 320 calories. So you're looking at like 3 cents an hour at most to cover that energy burn. It's definitely even less; I certainly don't think as hard as a grandmaster chess player.

Then, assume power costs 20 cents per kilowatt hour (US average). To match the human's 3 cents per hour, the machine can draw an average of about 150 watts. That's in the range of a budget graphics card, but not much past there.

However, if you sleep instead of sitting around, you can probably make AI cost competitive. Sleeping drops your metabolic rate by more, and lying down in bed (as opposed to sitting) also reduces calorie burn. Combined, you can reduce your burn by like 30 calories an hour. At the new 9 cents per hour human cost, you can afford to run a higher end graphics card drawing ~450 watts. That puts you in RTX 3090 range.
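The break-even estimate above can be sketched in a few lines. This assumes the comment's own inputs (potatoes at $1 per pound and 320 calories per pound, electricity at 20 cents per kWh); the function name is hypothetical.

```python
# Inputs taken from the comment above; all are rough estimates.
POTATO_USD_PER_LB = 1.00     # "a pound of potatoes is like a buck"
POTATO_KCAL_PER_LB = 320.0   # "about 320 calories"
USD_PER_KWH = 0.20           # assumed average US electricity price


def breakeven_watts(kcal_per_hour: float) -> float:
    """Average power draw whose electricity cost equals the food cost
    of burning `kcal_per_hour` calories, priced in potatoes."""
    food_usd_per_hour = kcal_per_hour / POTATO_KCAL_PER_LB * POTATO_USD_PER_LB
    kwh_per_hour = food_usd_per_hour / USD_PER_KWH
    return kwh_per_hour * 1000.0  # kWh per hour is kW; convert to watts


if __name__ == "__main__":
    for label, kcal in [("thinking hard", 10.0), ("sleeping instead", 30.0)]:
        print(f"{label}: about {breakeven_watts(kcal):.0f} W")
```

Unrounded, the two cases come out a little above the 150 W and 450 W quoted, because the comment rounds the food cost down to 3 and 9 cents before converting.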

axus 1 hour ago||||

What would you do for the rest of the day, power off your devices and go for a long bike ride?

enraged_camel 1 hour ago||

Speaking personally: yes. That's literally what I'm planning to do this afternoon because it's noon and I'm already done with the coding tasks I had on my plate today.

dofm 58 minutes ago||

Luckily the future is absolutely going to be that star trek one where technological abundance means we are all wealthy and have free time to develop personally, and not the future where all the money bubbles up into the hands of a thin-skinned malignant narcissist who wants to play with launching rockets and provoking racial violence /s

Yoric 46 minutes ago|||

I'm assuming that you need to feed the human being (i.e. you) regardless of whether you use that human being for writing code or not. So, by this metric, there is simply no break-even point. The cost of human + AI is always going to be higher than the cost of human.

jrm4 53 minutes ago|||

I'm in Florida and am already using AC, so if not "free", definitely "negligible."

rambojohnson 58 minutes ago|||

work at a cafe.

hillj23 19 minutes ago||

I think this is only going to become more relevant. I'm personally a $200/mo Claude Maxer and I know that the usage I'm getting on Opus 4.8 Max and (until they yoked it out from under me) Fable 5 is way, way more than what I'm paying them. At some point, this will turn usage-based and I will be hammered on it and probably forced to look at self-hosting. I think while the caps are there, even at $200, it's honestly not too bad if you're coding value into the market, but as soon as those caps come off for retail AI users, we're all going to have some tough choices to make.

mwcampbell 1 hour ago||

I invested about $4,000 in an NVIDIA DGX Spark several months ago. 128 GB of unified RAM, and the NVIDIA GB10 chip. With the RAM, the several CPU cores, and the 4 TB NVMe SSD, it's a very capable ARM64 Linux computer even without the GPU, and so far I've mostly been using it as such. But I wonder, what's the most capable model, specifically for coding, that can run well on that hardware?

Yoric 46 minutes ago||

https://www.canirun.ai/?status=tight might answer that question

morganastra 40 minutes ago||

Deepseek v4 flash is shockingly strong for its size and reportedly runs well on that hardware.

atreids 1 hour ago||

I find just going via Deepseek's platform API directly, using their V4 flash model, and hooking into a harness like Opencode more than acceptable. Think I've spent maybe $10 over a couple of weeks.

I did explore self-hosting models but hardware right now is just too expensive.

Yoric 45 minutes ago|

Directly at DeepSeek? It was my understanding (but I didn't check) that some other AI operators were offering (some of?) DeepSeek's models at lower prices.

Still, that's interesting. What do you get for that price? Only coding, or also e.g. image generation?

esalman 1 hour ago||

For me, investing in hardware seems to be the way to go.

I learned coding nearly 24 years ago and am still learning new stuff all the time. At no point did I have to rely on a subscription model to learn and do new stuff.

If LLMs and agents are the default tools for coding and building software, at least for the next few years, it seems like a no-brainer to invest $2000-3000 in hardware, like a Strix Halo PC.

CraigJPerry 1 hour ago||

I wondered if there might be a no brainer "free" option on discarded hardware.

I have a GTX1080ti which i think is circa 2018, it's unused, more than paid for itself over the years, owes me nothing at this point so the hardware is free.

It runs Gemma e4b multimodal, qwen 3.5 8b or the qwen 4b embeddings models well enough (40+ t/s for the LLMs).

The machine consumes 350 watts at the wall when under load (3 watts when sleeping, 80 W at idle). Electricity costs me £0.035/kWh, which is cheap for the UK (load shifting via house battery).

That works out to 144k output tokens for around 1 pence (and, in theory, it takes an hour to do that).

It's only JUST cheaper to use than the far more capable deepseek v4 flash model despite the free hardware and ~10x cheaper than normal electricity.
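A quick sketch of the running-cost arithmetic, using only the numbers in this comment (350 W under load, £0.035/kWh, 40 tokens per second as a floor). It treats the hardware as free, as the comment does, and ignores idle time and prompt processing. The helper names are made up for illustration.

```python
# Numbers from the comment above.
WALL_WATTS = 350.0        # draw at the wall under load
GBP_PER_KWH = 0.035       # load-shifted electricity price
TOKENS_PER_SECOND = 40.0  # "40+ t/s", so this is a lower bound


def tokens_per_hour(tokens_per_second: float) -> float:
    return tokens_per_second * 3600.0


def pence_per_hour(watts: float, gbp_per_kwh: float) -> float:
    # watts -> kWh used in one hour -> pounds -> pence
    return watts / 1000.0 * gbp_per_kwh * 100.0


def pence_per_million_tokens(watts: float, gbp_per_kwh: float,
                             tokens_per_second: float) -> float:
    hourly_cost = pence_per_hour(watts, gbp_per_kwh)
    return hourly_cost / tokens_per_hour(tokens_per_second) * 1_000_000.0


if __name__ == "__main__":
    print(f"tokens per hour: {tokens_per_hour(TOKENS_PER_SECOND):,.0f}")
    print(f"pence per hour: {pence_per_hour(WALL_WATTS, GBP_PER_KWH):.2f}")
    cost = pence_per_million_tokens(WALL_WATTS, GBP_PER_KWH, TOKENS_PER_SECOND)
    print(f"pence per million output tokens: {cost:.1f}")
```

This gives 144,000 tokens per hour for roughly 1.2 pence, or about 8.5 pence per million output tokens. Comparing that with a hosted model needs the provider's current per-token price, which the comment doesn't state.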

iugtmkbdfil834 1 hour ago|||

Yes and no. Hardware does lock you in. Granted, I am happy with my 128gb of shared memory, but I am mildly concerned that it actually is more expensive now than when I bought mine. It does not bode well for the future; not when combined with recent WH admin moves on Anthropic and the reality that next batch of good models may require more than 128gb to run well.

edit: I am not dismissing local. I am one such user ( though I have subs too ), but one has to be clear eyed about the trade-offs.

hgoel 55 minutes ago|||

$3k isn't getting you frontier model capability. It's barely getting you any capability if that's split into buying an entire PC rather than just GPUs.

jrm4 51 minutes ago|||

With you here. I'm using my cheapo 16gig vram card I picked up a year or so ago, and I'm like -- yes, I perceive that you can pay for way more tokens per second than I can do at home.

But that feels like measuring productivity in lines of code. For what I'm doing, I'm not seeing the benefit in any subscription.

Sure, I can't one-prompt a whole new boring CRUD app, but oh well.

throwatdem12311 1 hour ago||

3k? Try 10

vadansky 1 hour ago||

Can I run something comparable to Opus 4.6 locally yet? I keep hearing conflicting things. If I can spend 10k to do that I would cancel my subscription. The problem is I don’t wanna spend the money to find out myself.

Catloafdev 1 hour ago||

If you want frontier-level, the economically reasonable option is OpenRouter or a direct sub to frontier-of-your-choice.

The reality is that they do not offer configurations that would allow a consumer to run that much VRAM on a single setup to protect datacenter margins. Apple used to, and they stopped, those devices are going for ~$20k+ each on ebay now.

You can get very, very capable models on a 3090/4090/5090/6000 series card. But if you want 'frontier level' you are investing ~22k at a bare minimum if you go new. Used you can probably build your own server for much cheaper up-front cost but it's likely going to be 4-6x+ electricity usage.

daemonologist 1 hour ago|||

There are also significant economies of scale (namely: utilization and batching), which tend to make inference on a shared server more economical even after the operator takes a cut.

theossuary 58 minutes ago|||

I truly think by 2028 we'll have integrated chip systems that'll be able to run opus 4.8 level models at ~500 watts at acceptable performance. Honestly I think now is the worst time to invest in AI hardware. Get your harness ready and processes perfected with hosted models, and wait a few years to buy hardware to transition to running models locally.

baq 54 minutes ago|||

Burning weights onto a chip in an efficient way and exposing that via USB would be acceptable for a good enough model tbh

ajbourg 41 minutes ago||

This is pretty close to what Taalas is doing.

hurtigioll 28 minutes ago||||

if such hardware becomes available, it will be bought by the data-centers, just like they buy all the RAM today

CamperBob2 48 minutes ago|||

> Honestly I think now is the worst time to invest in AI hardware.

That position is not without its own risks, though. Maybe Opus 4.8 will run on a single chip by 2028... and maybe you won't be allowed to touch it.

And what if Xi makes a play for Taiwan? That would be stupid, but so was invading Ukraine with tanks from Temu, and it still happened.

grim_io 1 hour ago|||

10k will not get you anywhere near opus or sonnet. It's simply not possible for mere mortals currently.

als0 1 hour ago|||

> Can I run something comparable to Opus 4.6 locally yet?

Sadly, no. The best comparable thing you can get is about Sonnet 3.7

captaintobs 1 hour ago|||

i spent 8k and get close to a 2-3x slower sonnet. running 2x spark deep seek v4 flash

CamperBob2 54 minutes ago|||

Some benchmarks have shown Kimi K2.6 within error-bar distance of Opus 4.6, and you can run it on eight RTX6000s. Right now it's not possible to set up a machine like that from scratch for less than $100K... but right now it's also hard to put a price on autonomy.

atemerev 53 minutes ago||

Best you could do is connect two Mac Studio M3 Ultra 512G RAM each with Thunderbolt. Then theoretically you can run frontier Chinese models (but not Deepseek v4 Pro yet). That would be about $20k.

But - good luck finding them. Apple discontinued the model a few months ago. And more recently, even 256G model was discontinued. Big AI really really does not want people to get off their needle.

RomanPushkin 1 hour ago|

AI coding at home literally costs $100/month. I'm wondering where $400 is coming from? $100 is more than enough for "coding at home", IMO. I rarely face the limits, and when I do it's just a time for a quick walk anyway.
