Posted by svara 7 hours ago
Ask HN: How is AI-assisted coding going for you professionally?
If you've recently used AI tools for professional coding work, tell us about it.
What tools did you use? What worked well and why? What challenges did you hit, and how (if at all) did you solve them?
Please share enough context (stack, project type, team size, experience level) for others to learn from your experience.
The goal is to build a grounded picture of where AI-assisted development actually stands in March 2026, without the hot air.
It has also enabled a few people to write code or plan out implementation details who haven't done so in a long time (sometimes a decade or more), and so I'm getting some bizarre suggestions.
Otherwise, it really does depend on what kind of code. I hand write prod code, and the only thing that AI can do is review it and point out bugs to me. But for other things, like a throwaway script to generate a bunch of data for load testing? Sure, why not.
Or I'll walk up to your desk and ask you to explain it.
It’s the asymmetric expectations—that one person can spew slop but the other must go full-effort—that for me personally feels disrespectful.
Last year I was working on implementing a pretty big feature in our codebase. It required a lot of focus to get the business logic right, and at the same time you had to be very creative to make it feasible to run without hogging too many resources.
When I was nearly done and working on catching bugs, team members grew tired of waiting and started taking my code from x weeks ago (I have no idea why), feeding it to Claude or whatever, and coming back with a solution. So instead of finishing my code I had to go through their versions of my code.
Each one of the proposals had one or more business requirements wrong and several huge bugs. Not one was any closer to a solution than mine was.
I would have appreciated any contribution to my code, but thinking that it would be so easy to just take my code and finish it by asking Claude was rather insulting.
We're in a phase where founders are obsessed with productivity so everything seems to work just fine and as intended with few slops.
They're racing to be as productive as possible so we can get who knows where.
There are times when I honestly don't even know why we're automating certain tasks anymore.
In the past, we had the option of saying we didn't know something, especially when it was an area we didn't want to know about. Today, we no longer have that option, because knowledge is just a prompt away. So you end up doing front-end work for a backend application you just built, even though your role was supposed to be completely different.
Something resembling knowledge anyway. A sort of shambling mound wearing knowledge like a skinsuit
I'm running Codex on a Raspberry Pi, and Claude Code CLI, Gemini CLI, and Claude in Chrome all on a Mac, all touching the same project across both machines. The drift is constant. One agent commits, the others don't know about it, and now you've got diverged realities. I'm not a coder so I can't just eyeball a diff and know which version is right.
Ended up building a mechanical state file that sits outside all the context windows. Every commit, every test run, every failed patch writes to it. When a new session starts, the agent reads that file first instead of trusting its own memory. Boring ops stuff really, but it's the only thing that actually stopped the "which version is real" problem.
Most of my gripes are with the harness, CC is way better.
In terms of productivity I'm def 2-4X more productive at work, >10x more productive on my side business. I used to work overtime to deliver my features. Now I work 9-5 and am job hunting on the side while delivering relatively more features.
I think a lot of people are missing that AI is not just good for writing code. It's good for data analysis and all sorts of other tasks like debugging and deploying. I regularly use it to manage deployment loops (ex. make a code change and then deploy the changes to gamma and verify they work by making a sample request and verifying output from cloudwatch logs etc). I have built features in 2 weeks that would take me a month just because I'd have to learn some nitty technical details that I'd never use again in my life.
For data analysis I have an internal glue catalog, I can just tell it to query data and write a script that analyzes X for me.
AI and agents particularly have been a huge boon for me. I'm really scared about automation but also it doesn't make sense to me that SWE would be automated first before other careers since SWE itself is necessary to automate others. I think there are some fundamental limitations on LLMs (without understanding the details too much), but whatever level of intelligence we've currently unlocked is fundamentally going to change the world and is already changing how SWE looks.
In the bucket of "really great things I love about AI", that would definitely be at the top. So often in my software engineering career I'd have to spend tons of time learning and understanding some new technology, some new language, some esoteric library, some cobbled-together build harness, etc., and I always found it pretty discouraging when I knew that I'd never have reason to use that tech outside the particular codebase I was working on at that time. And far from being rare, I found that, working in a fairly large company, that was a pretty frequent occurrence. E.g. I'd look at a design doc or feature request and think to myself "oh, that's pretty easy and straightforward", only to go into the codebase and see the original developer/team decided on some extremely niche transaction handling library or whatever (or worse, homegrown with no tests...), and trying to figure out that esoteric tech turned into 85% of the actual work. AI doesn't reduce that to 0, but I've found it has been a huge boon to understanding new tech and especially for getting my dev environment and build set up well, much faster than I could do manually.
Of course, AI makes it a lot easier to generate exponentially more poorly architected slop, so not sure if in a year or two from now I'll just be ever more dependent on AI explaining to me the mountains of AI slop created in the first place.
Sanctioned comment?
Pretty sure the answer is here :)
This is a key candidate for using AI, as we have built hundreds of warehouses in the past. We have a standard product that spans over a hundred thousand lines of code to build upon. Still, we rely on copying code from previous projects if features have been implemented before. We have stopped investing in the product to migrate everything to microservices, for some reason, so this code copying is increasingly common as projects keep getting more complex.
Teams to implement warehouses are generally around eight developers. We are given a design spec to implement, which usually spans a few hundred pages.
AI has more than doubled the speed at which I can write backend code. We've done the same task so many times before with previous warehouses that we have a gold mine of patterns that AI can pick up on if we have a folder of previous projects that it can read. I also feel that the code I write is higher quality, though I have to think more about the design, as previously I would realize something wouldn't work whilst writing the code. At GWT though, it's hopeless, as there are almost no public GWT projects to train an AI on. It's also very helpful in tracing logs and debugging.
We use Cursor. I was able to use $1,300 worth of Claude Opus 4.6 tokens for a cost of $100 to the company. Sadly, Cursor discontinued its legacy pricing model due to it being unsustainable, so only the non-frontier models are priced low enough to consistently use. I'm not sure what I'm going to do when this new pricing model takes effect tomorrow. I guess I will have to go back to writing code by hand or figure out how to use models like Gemini 3.1. GPT models also write decent code, but they are always so paranoid and strictly follow prompts to their own detriment. Gemini just feels unstable and inconsistent, though it does write higher quality code.
I'm not being paid any more for doubling my output, so it's not the end of the world if I have to go back to writing code by hand.
At work, the devs up the chain now do everything with AI – not just coding – then task me with cleaning it up. It is painful and time consuming, the code base is a mess. In one case I had to merge a feature from one team into the main code base, but the feature was AI coded so it did not obey the API design of the main project. It also included a ton of stuff you don’t need in the first pass - a ton of error checking and hand-rolled parsing, etc, that I had to spend over a week unrolling so that I could trim it down and redesign it to work in the main codebase. It was a slog, and it also made me look bad because it took me forever compared to the team who originally churned it out almost instantly. AI tools are not good at this kind of design deconflicting task, so while it’s easy to get the initial concept out the gate almost instantly, you can’t just magically fit it into the bigger codebase without facing the technical debt you’ve generated.
In my personal projects, I get to experience a bit of the fun I think others are having. You can very quickly build out new features, explore new ideas, etc. You have to be thoughtful about the design because the codebase can get messy and hard to build on. Often I design the APIs and then have Claude critique them and implement them.
I think the future is bleak for people in my spot professionally – not junior, but also not leading the team. I think the middle will be hollowed out and replaced with principals who set direction, coordinate, and execute. A privileged few will be hired and developed to become leaders eventually (or strike gold with their own projects), but everyone in between is in trouble.
People who disagree at all levels of seniority have been made to leave the organization.
Practically speaking, there's no sexy pitch you can make about doing quality grunt work. I've made that mistake virtually every time I've joined a company: I make performance improvements, I stabilize CI, I improve code readability, remove compiler warnings, you name it: but if you're not shipping features, if you're not driving the income needle, you have a much more difficult time framing your value to a non-engineering audience, who ultimately sign the paychecks.
Obviously this varies wildly by organization, but it's been true everywhere I've worked to varying degrees. Some companies (and bosses) are more self-aware than others, which can help for framing the conversation (and retaining one's sanity), but at the end of the day if I'm making a stand about how bad AI quality is, but my AI-using coworker has shipped six medium sized features, I'm not winning that argument.
It doesn't help that I think non-engineers view code quality as a technical boogeyman and an internal issue to their engineering divisions. Our technical leadership's attitude towards our incidents has been "just write better code," which... Well. I don't need to explain the ridiculousness of that statement in this forum, but it undermines most people's criticism of AI. Sure, it writes crap code and misses business requirements; but in the eyes of my product team? That's just dealing with engineers in general. It's not like they can tell the difference.
It's best to sniff out values mismatches ASAP and then decide whether you can tolerate some discomfort to achieve your personal goals.
You’re much better off mixing both (quality work and product features).
Yes? In the same way any victim of shoddy practices is "part of the problem"?
I know a lot of people who tried playing this game frequently during COVID, then found themselves stuck in a bad place when the 0% money ran out and companies weren’t eager to hire someone whose resume had a dozen jobs in the past 6 years.
I hope you get the privilege soon
You can and should speak up when tasks are poorly defined, underestimated, or miscommunicated.
Try to flat out “refuse” assigned work and you’ll be swept away in the next round of layoffs, replaced by someone who knows how to communicate and behave diplomatically.
They clearly were not advocating for flat out refusing.
It's just plain unprofessional to just YOLO shit with AI and force actual humans to read the code even if the "author" hasn't read it.
Also, API design etc. should be automatically checked by tooling, and CI builds, and thus PR merges, should be denied until the checks pass.
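As a rough illustration of the kind of automated gate described above, here is a minimal sketch in Python. The rules it enforces (public functions need a docstring and a return annotation) and the name `api_violations` are hypothetical stand-ins; a real project would encode its own API conventions.

```python
import ast


def api_violations(source, filename="<module>"):
    """Return messages for public functions that break the (hypothetical) API rules."""
    problems = []
    tree = ast.parse(source, filename=filename)
    for node in tree.body:
        # Only top-level function definitions are checked in this sketch.
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if node.name.startswith("_"):
            continue  # private helpers are exempt
        if ast.get_docstring(node) is None:
            problems.append(f"{filename}:{node.lineno}: {node.name} has no docstring")
        if node.returns is None:
            problems.append(f"{filename}:{node.lineno}: {node.name} has no return annotation")
    return problems
```

A CI job could run this over the changed files and exit non-zero when the returned list is non-empty, which is what would block the merge.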
If they're handing you broken code call them out on it. Say this doesn't do what it says it does, did you want me to create a story for redoing all this work?
What the hell are you playing hero for? Delegate the choice to your manager: ruin the codebase or allocate two weeks for clean-up - their choice. If the magical AI team claims they can do integration faster - let them.
IME too many don't care about on call unless they are personally affected.
This has to be the most thankless job for the near future. It's hard and you get about as much credit as the worker who cleans up the job site after the contractors are done, even though you're actually fixing structural defects.
And god forbid you introduce a regression bug cleaning up some horrible redundant spaghetti code.
I know my mind fairly well, and I know my style of laziness will result in atrophying skills. Better not to risk it.
One of my co-workers already admitted as much to me around six months ago, and that he was trying not to use AI for any code generation anymore, but it was really difficult to stop because it was so easy to reach for. Sounded kind of like a drug addiction to me. And I had the impression he only felt comfortable admitting it to me because I don't make it a secret that I don't use it.
Another co-worker did stop using it to generate code because (if I'm remembering right) he can tell what it generates is messy for long-term maintenance, even if it does work and even though he's new to React. He still uses it often for asking questions.
A third (this one a junior) seemed to get dumber over the past year, opening merge requests that didn't solve the problem. In a couple of these cases my manager mentioned either seeing him use AI while they were pairing (and it looked good enough so the problems just slipped by) or seeing hints in the merge request in how AI names or structures the code.
> he can tell what it generates is messy for long-term maintenance, even if it does work and even though he's new to React.
When one can generate code in such a short amount of time, logically it is not hard to maintain. You could just re-generate it if you didn't like it. I don't believe this style of argument where it's easy to generate with AI but then you cannot maintain it after. It does not hold up logically, and I have yet to see such a codebase where AI was able to generate it, but now cannot maintain it. What I have seen this year is feature-complete language and framework rewrites done by AI with these new tools. For me the unmaintainable code claim is difficult to believe.
It just hallucinates packages, adds random functions that already exist, creates new random APIs.
How is that not unmaintainable?
I started using it for things I hate, ended up using it everywhere. I move 5x faster. I follow along most of the time. Twice a week I realize I’ve lost the thread. Once a month it sets me back a week or more.
Professionally, I have had almost no luck with it, outside of summarizing design docs or literally just finding something in the code that a simple search might not find, such as "where is this team's code that does X?"
I am yet to successfully prompt it and get a working commit.
Further, I will add that I also don't know any ICs personally who have successfully used it. Though, there are endless posts of people talking about how they're now 10x more productive, and everyone needs to do x, y, and z now. I just don't know any of these people.
Non-professionally, it's amazing how well it does on a small greenfield task, and I have seen that 10x improvement in velocity. But, at work, close to 0 so far.
Of the posts I've seen at work, they typically tend to be teams doing something new / greenfield-ish or a refactor. So I'm not surprised by their results.
I’ve probably prompted 10,000 lines of working code in the last two months. I started with Terraform, which I know backwards and forwards. Works perfectly 95% of the time and I know where it will go wrong so I watch for that. (Working in greenfield, in other existing repos, and with other collaborators.)
Moved on to a big data processing project, works great, needed a senior engineer to diagnose one small index problem which he identified in 30s. (But I’d bonked on for a week because in some cases I just don’t know what I don’t know)
Meanwhile a colleague wanted a sample of the data. Vibe coded that. (Extract from zip without decompressing) He wanted randomized. One shot. Done. Then he wanted randomized across 5 categories. Then he wanted 10x the sample size. Data request completed before the conversation was over. I would have worked on that for three hours before and bonked if I hit the limit of my technical knowledge.
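For readers wondering what a request like that might look like in code, here is a minimal sketch, not the script from the comment (which was not shared). It assumes the archive holds a CSV with a category column and reads the member as a stream, so nothing is unpacked to disk. The function name `stratified_sample` and all parameter names are hypothetical.

```python
import csv
import io
import random
import zipfile
from collections import defaultdict


def stratified_sample(zip_path, member, category_col, per_category, seed=0):
    """Sample up to `per_category` rows per category from a CSV inside a zip.

    The member is streamed straight from the archive; no file is extracted to disk.
    """
    rng = random.Random(seed)
    reservoirs = defaultdict(list)  # category -> sampled rows
    seen = defaultdict(int)         # category -> rows seen so far
    with zipfile.ZipFile(zip_path) as zf, zf.open(member) as raw:
        reader = csv.DictReader(io.TextIOWrapper(raw, encoding="utf-8", newline=""))
        for row in reader:
            cat = row[category_col]
            seen[cat] += 1
            bucket = reservoirs[cat]
            if len(bucket) < per_category:
                bucket.append(row)
            else:
                # Reservoir sampling: the n-th row replaces a kept row
                # with probability per_category / n.
                j = rng.randrange(seen[cat])
                if j < per_category:
                    bucket[j] = row
    return dict(reservoirs)
```

Asking for "10x the sample size" then only means changing `per_category`, and a fixed `seed` keeps the sample reproducible between runs.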
Built a monitoring stack. Configured servers, used it to troubleshoot dozens of problems.
For stuff I can’t do, now I can do. For stuff I could do with difficulty now I can do with ease. For stuff I could do easily now I can do fast and easy.
Your vastly different experience is baffling and alien to me. (So thank you for opening my eyes)
I see it generating between 50% and 90% accuracy in both small and large tasks, as in the PRs it generates range from 50% usable code that a human can tweak to a 90% solution (with the occasional 100% wow, it actually did it, no comments, let's merge).
I also found it to be a skillset: some engineers seem to find it easier to articulate what they want and some find it easier to think while writing code.
I think I've amended that thought. They are not necessarily lacking in intelligence. I hypothesize that LLMs pick up on optimism and pessimism among other sentiments in the incoming prompt: someone prompting with no hope that the result will be useful ends up with useless garbage output and vice versa.
It's pretty clear that people think greenfield projects can constantly be slopified and that AI will always be able to dig them another logical connection, so it doesn't matter which abstraction the AI chose this time; it can always be better.
This is akin to people who think we can just keep using oil to fuel technological growth because it'll somehow improve the ability of technology to solve climate problems.
It's akin to the techno capitalist cult of "effective altruism" that assumes there's no way you could f'up the world that you can't fix with "good deeds"
There's a lot of hidden context in evaluating the output of LLMs, and if you're just looking at today's success, you'll come away with a much different view than if you're looking at next year's.
Optimism, in this case, is just the belief that the AI will keep getting so much more powerful that it'll always clean up today's mess.
I call this techno magic, indistinguishable from religious 'optimism'.
Once the plan is set, using the agentic coder to create smaller CLs has been the best avenue for me. You don't want to generate code faster than you and your reviewers can comprehend it. It'll feel slow, but check ins actually move faster.
I will say it's not all magic and success. I have had the AI lead me down some dark corners, assuring me one design would work when actually it is a bit outdated or not quite the right fit for the system we are building for because of reasons. So, I wouldn't really say that it's a 10x multiplier or anything, but I'm definitely getting things done faster than I could on my own. Expertise on the part of the user is still crucial.
One classic issue I used to run into, is doing a small refactor and then having to manually fix a bunch of tests. It is so much simpler to ask the LLM to move X from A to B and fix any test failures. Then I circle back in a few minutes to review what was done and fix any issues.
The other thing is, it has visibility into the wider code base, including some of the infrastructure that we depend on. There have been a couple of times in the past quarter where our build was busted by an external team, and I was able to give the LLM the timeframe and a description of the issue and have it find the exact external failure that caused it. I don't really know how long it would have taken to resolve the issue otherwise, since the issues were missed by their testing. That said, I gotta wonder if those breakages were introduced by LLM use.
My job hasn't been this fun in a long, long time and I am a little uneasy about what these tools are going to mean for my personal job security, but I don't know how we can put the genie back into the bottle at this point.
The FANG code bases are very large and date back years. They are not necessarily using open source frameworks but rather in-house libraries and frameworks, none of which are available to Anthropic or OpenAI, hence these models have zero visibility into them.
Combine that with the fact that these are not reasoning or thinking machines but rather probabilistic (image/text) generators, and they can't generate what they haven't seen.
It does work sometimes. The smaller the task, the better.
May I ask what you're working on?
Meta, despite competing with these, is open to letting their devs use better off-the-shelf tools.
1. Degraded quality over longer context window usage. I have to think about managing context and agents instead of focusing solely on the task.
2. It’s slow (when it’s “thinking”). Especially when it’s tasked with something simple (e.g., I could ask Claude Opus to commit code and submit for review but it’s just faster if I run the commands myself and I don’t want to have to think about conditionally switching to Haiku / faster models mid task execution).
3. It often requires a lot of upfront planning and feedback loop set up to the extent that sometimes I wonder if it would’ve been faster if I did it myself.
A smarter model would be great but there are bigger productivity gains to be had with a good set up, a faster model, and abstracting away the need to think about agents or context usage. I’m still figuring out a good set up. Something with the speed of Haiku with the reasoning of Opus without the overhead of having to think about the management of agents or context would be sweet.
I've been working on this and landed on a pattern I call a "mechanical ledger", basically a structured state file that sits outside any context window and gets updated as a side effect of work, not as a step anyone remembers to do. Every commit writes to it, every failed patch writes to it, every test run writes to it. When a session starts (or an agent compacts), it reads the ledger and rebuilds context from ground truth instead of from memory.
It's not a novel idea really, it's basically what ops teams do with runbooks and state files, but applied to the AI agent handoff problem. The interesting bit is making the updates mechanical so no agent can forget to do it.
I was thinking about this recently. This kind of setup is a Holy Grail everyone is searching for. Make the damn tool produce the right output more of the time. And yet, despite testing the methods provided by the people who claim they get excellent results, I still come to the point where it goes off the rails. Nevertheless, since practically everybody works on resolving this particular issue, and huge amounts of money have been poured into getting it right, I hope in the next year or so we will finally have something we can reliably use.
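A minimal sketch of what such a ledger could look like, assuming an append-only JSON Lines file. The names `record_event` and `rebuild_context`, the event kinds, and the file layout are all hypothetical; the comment above does not include an implementation. In practice the writes would be triggered by hooks (for example a git post-commit hook or a wrapper around the test runner) rather than by the agent, which is what would make the updates mechanical.

```python
import json
import time
from pathlib import Path


def record_event(ledger_path, kind, detail, now=None):
    """Append one event (commit, test run, failed patch) as a JSON line."""
    event = {
        "ts": time.time() if now is None else now,
        "kind": kind,
        "detail": detail,
    }
    with Path(ledger_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")
    return event


def rebuild_context(ledger_path):
    """Replay the ledger and return the latest known state for a new session."""
    state = {"last_commit": None, "last_test_run": None, "failed_patches": []}
    path = Path(ledger_path)
    if not path.exists():
        return state
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        if event["kind"] == "commit":
            state["last_commit"] = event["detail"]
        elif event["kind"] == "test_run":
            state["last_test_run"] = event["detail"]
        elif event["kind"] == "failed_patch":
            state["failed_patches"].append(event["detail"])
    return state
```

A new session would call `rebuild_context` first and treat its result, not the agent's own recollection, as the source of truth about which commit is current.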
This year I grudgingly bit the bullet and began using AI tools, and to my dismay they've been a pretty big boon for me, in this case. Not just for code generation - they're really good at probing the monolith and answering questions I have about how it works. Before I'd spend days poring over code before starting work to figure out the right way to build something or where to break in, pinging people over in India or eastern Europe with questions and hoping they reply to me overnight. AI's totally replaced that, and it works shockingly well.
When I do fall back on it for code generation, it's mostly just to mitigate the tedium of writing boilerplate. The code it produces tends to be pretty poor - both in terms of style and robustness - and I'll usually need to take at least a couple of passes over it to get it up to snuff. I do find this faster than writing everything out by hand in the end, but not by a lot.
For my personal projects I don't find it adds much, but I do enjoy rubber ducking with ChatGPT.
In fact it looks like an emerging theme is that whenever we use these tools it's valuable to maintain a human understanding of what's actually going on.
I'm lucky enough to have upper management not pressuring me to use it this or that way, and I'm using it mostly to assist with programming languages/frameworks I'm not familiar with. Also, test cases (these sometimes come out wrong and I need to review them thoroughly), updating documentation, my rubber duck, and some other repetitive/time-consuming tasks.
Sometimes, if I have a simple, self-contained bug scenario where extensive debugging won't be required, I ask it to find the reason. I have a high rate of success here.
However, it will not help you with avoiding anti-patterns. If you introduce one, it will indulge instead of pointing the problem out.
I did give it a shot on full vibe-coding a library into production code, and the experience was successful; I'm using the library - https://youtu.be/wRpRFM6dpuc
We have Cursor with essentially unlimited Opus 4.6 and it’s fundamentally changed my workflow as a senior engineer. I find I spend much more time designing and testing my software, and development time is almost entirely prompting and reviewing AI changes.
I’m afraid my coding skills are atrophying, in fact I know they are, but I’m not sure if the coding was the part of my job I truly enjoyed. I enjoy thinking higher-level: architecture, connecting components, focusing on the user experience. But I think using these AI tools is a form of golden handcuffs. If I go work at a startup without the money to pay for these models, I think for the first time in my career I would be less likely to be able to successfully code a feature than I could last year.
So professionally there are pros and cons. My design and architecture skills have greatly improved as I am spending more time doing this.
Personally it’s so much fun. I’ve made several side projects I would have never done otherwise. Working with Claude Code on greenfield projects is a blast.
Particularly in situations where you might have to navigate a change in jobs and get back to the point where you can reasonably prove that you can program at a professional level (will be interesting to see how/if the interviewing process changes over time due to LLMs).