Posted by phire 5 days ago
I'm not sure about that: the code LLMs generate isn't categorically worse than code written by people who no longer work here and whom I can't ask anything either. It's also not much better or worse than what you'd find online, but it has a broader reach than my Google-fu, alongside some hallucinations. At the same time, AI doesn't hate writing tests, because it doesn't get a choice. It doesn't take breaks and doesn't half-ass things any more or less depending on how close it is to 5 PM.
Maybe my starting point is viewing all code as a liability and not trusting anything anyone (myself included) has written all that much, so the point doesn't really resonate with me. That said, I have used AI to push out codebases that work, albeit that took a testable domain and a lot of iteration.
It produces results but also rots my brain somewhat, because the act of writing code becomes less mentally stimulating than the requirements engineering around it.
I think the premise is true that writing code was never the "main" bottleneck, but like any power tool, an LLM wielded by the right person can blow past bottlenecks.
Many of these arguments only assume the case of an inexperienced engineer blindly pumping out and merging code. I concede the problems in that case.
But put this to the test with more experienced engineers: how does it change their workflows? The results I've personally observed are dramatically different.
---
> LLMs reduce the time it takes to produce code, but they haven’t changed the amount of effort required to reason about behavior, identify subtle bugs, or ensure long-term maintainability.
I have to strongly disagree here; this argument doesn't apply universally. I've actually found that LLMs make it easier to understand large swaths of code, faster, especially in larger codebases with legacy code that no one has worked on or dared to touch. LLMs bring an element of fearlessness, which makes it easier to effect change.
If you have written about the workflow behind this outcome, I'd appreciate it if you shared it.
And FWIW, I'm also not alone in this observation. I can remember at least two times in the last month when other colleagues cited this exact same benefit.
E.g., a complicated algo that someone wrote three years ago, which works well enough but has always had subtle bugs. Over a two-day workshop, we start by writing a bunch of (meaningful) tests with an LLM (a sketch of that step follows below), then ask the LLM about portions of the code, piecing together why a certain bit of logic existed or was written a certain way, add more tests to confirm working behavior, then start refactoring and changing the algo (also with an LLM).
Much of this is similar to how we'd do it without LLMs, but no one had bothered to improve/change it because the time investment and ROI didn't make sense (let alone the cognitive burden of gathering context from git logs or from old-timers who hold nuggets of context that could be pieced together). With LLMs, a lot of that friction can be reduced.
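For illustration, the behavior-pinning tests might start like this. A minimal sketch in Python: `legacy_pricing.compute_discount` and the expected values are hypothetical stand-ins, captured from whatever the old code actually returns today, not from any real project.

    import pytest

    from legacy_pricing import compute_discount  # hypothetical legacy module

    # Cases recorded from the code as it behaves today. Even results that
    # look "wrong" get pinned first, so a refactor can't silently change them.
    @pytest.mark.parametrize("cart_total, tier, expected", [
        (100.0, "basic", 0.0),
        (250.0, "basic", 12.5),
        (250.0, "gold", 37.5),
    ])
    def test_pins_current_behavior(cart_total, tier, expected):
        assert compute_discount(cart_total, tier) == pytest.approx(expected)

Once those pass against the unmodified algo, the refactor (LLM-assisted or not) has a safety net.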
Introducing a lever to suddenly produce more code faster creates an imbalance in the SDLC. If our review process was already a bottleneck, now that problem is even worse! If the review bottleneck was something we could tolerate/ignore before, that's no longer the case, we need to solve for it. No, that doesn't mean let some LLM review the code and ship it. CI/CD needs to get better and smarter. As a reviewer, I don't want to be on the lookout for obscure edge cases. I want to make sure my peer solved the problem in a way that makes sense for our team. CI/CD should take care of making sure the code style aligns with our policies, that new/updated tests provide enough coverage for the new/changed functionality, and that the feature actually works.
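To make that concrete, a merge gate could chain existing tools so the machine handles policy before a human reviews design. A rough sketch only; the tools are real, but the thresholds and wiring are my assumptions, not a prescription:

    #!/usr/bin/env python3
    # Toy merge gate: machines police style and coverage so human review
    # can focus on whether the approach makes sense for the team.
    import subprocess
    import sys

    CHECKS = [
        ["ruff", "check", "."],                     # code style policy
        ["coverage", "run", "-m", "pytest", "-q"],  # does the feature actually work?
        ["coverage", "report", "--fail-under=85"],  # keep coverage from eroding
    ]

    def main() -> int:
        for cmd in CHECKS:
            print("::", " ".join(cmd))
            if subprocess.run(cmd).returncode != 0:
                return 1  # block the merge before a human ever looks at it
        return 0

    if __name__ == "__main__":
        sys.exit(main())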
The code expertise / shared context is another tough problem that needs solving, only highlighted by introducing a random graph of numbers generating the code. Leaning on that one engineer who has been on the team for 30 years and knows where all the deep dark secrets are was not a sustainable path even before coding agents. Having a markdown file that just says "component foo is under /foo. Run make foo to test it" was not documentation. The imbalance in the SDLC will light the fire under our collective asses to provide proper developer documentation and tooling for our codebases. I don't know what that looks like yet. Some teams are trying to have *good* markdown files that actually document where all the deep dark secrets are. These are doubly beneficial because coding agents can use them as well as your humans. But better markdown is probably a small step toward the real fix, which we won't be able to live without in the near future.
Anyway, great points brought up in the article. Coding agents aren't going away, so we need to solve this imbalance in the SDLC. Fight fire with fire!
For almost 20 years I have watched employers try to solve, or cheat their way around, this low confidence. The result is always the same: some shitty form of copy/paste pattern work, missing originality, and long delivery timelines for really basic features. The reasons are that nobody wants to invest in training/baselines, and a great fear that anyone perceived as talent is irreplaceable and can leave.
My current job in enterprise API management is the first time the bottleneck is different. Clearly the bottleneck is the customer's low confidence, as opposed to the developers', and it manifests as a very slow requirements-gathering process.
It’s decisions.
Ninety-five percent of it is all the decisions made by every person involved.
The fastest deliveries I ever encountered were situations where the stakeholders were intimately familiar with the problem and made quick decisions. Only then did speed of coding affect delivery.
In large organizations, PMs are rarely POs. Every decision needs to be run up the flagpole and through committee with CYAs and delays at every step.
Decision makers are outsourcing this to LLMs now, which is scary, as they are supposed to be the SMEs.
It's the same old same old where the generals make decisions but the sergeants (NCOs) really run the army. That's where I feel leads/principals/staff really make or break the product. They are the fulcrum, dealing with LLMs from above and below.
The point is that there's a human element to code that we can't even capture when working remotely with intelligent humans. LLMs will always be like a remote worker.
I recently started working on a client's project where we were planning on hiring a designer to build the front-end UI. Turns out, Gemini can generate really good UIs. Now we're saving a lot of time because I don't have to wait on the designer to provide designs before I can start building. The cost savings are most welcome as well.
Coding is definitely a bottleneck because my client still needs my help to write code. In the future, non-programmers should be able to build products on their own.
I don’t think there’s enough distinction between using LLMs for frontend and backend in discussions similar to these.
Using it for things like CSS/Tailwind/UI widget layout seems like a low risk timesaver.
I can relate :-)
Our team maintains a configuration management database for a company that has grown mostly organically from 3 to 500+ employees in ~30 years.
We don't have documented processes that would account for most of the write operations, so if we have a question, we cannot just talk to the process owner.
The next option would be to talk to the data owner, but for many of our entities, we don't have a data owner. So we look into the audit logs to see which teams often touch the data, and then we do a meeting with some senior folks from each of these teams to discuss things.
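Mining the audit logs for that is the mechanical part. A toy sketch, assuming a hypothetical JSON-lines export where each write event carries "entity" and "team" fields:

    import json
    from collections import Counter

    def likely_stakeholders(log_path: str, entity: str, top_n: int = 3):
        """Rank teams by how often they write to a given entity."""
        writes = Counter()
        with open(log_path) as f:
            for line in f:
                event = json.loads(line)
                if event["entity"] == entity:
                    writes[event["team"]] += 1
        return writes.most_common(top_n)

    # e.g. likely_stakeholders("audit.jsonl", "network_device")
    # -> the teams whose senior folks we invite to the meeting

The hard part is what comes after: getting those teams into a room.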
But of course, finding common meeting time slots with several senior people from several teams isn't easy; they're all busy. So that alone might delay something by a few weeks to months.
For low-stakes decisions, we often try to not go through this effort, but instead do things that are easy to roll back if they go wrong.
Once we have identified the stakeholders, have a common understanding among them, and a rough consensus on how to proceed, the actual code changes are often relatively simple in comparison.
So, I guess this falls under "overhead of coordination and communication".
I'd argue that they're slowly changing that as well -- you can ask an LLM to "read" code, summarize / review / criticize it. At the least, it can help accelerate onboarding onto new / unfamiliar codebases.
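A toy version of that, using the OpenAI Python client; the model name, prompt, and file path are assumptions, and any chat-style LLM API would do:

    from pathlib import Path
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    source = Path("legacy/scheduler.py").read_text()  # hypothetical file

    resp = client.chat.completions.create(
        model="gpt-4o",  # assumption; use whatever model you have access to
        messages=[{
            "role": "user",
            "content": "Summarize what this module does, note anything "
                       "surprising, and list likely subtle bugs:\n\n" + source,
        }],
    )
    print(resp.choices[0].message.content)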
Autocomplete speeds up code generation by an order of magnitude, easily, with no real downside when used by experienced devs. Vibe coding on the other hand completely replaces the programmer and causes lots of new issues.
Strongly disagree. Autocomplete thinks slower than I do, so if I want to take advantage of it I have to slow myself down a bunch.
Instead of just writing a function, I write a line or two, wait to see what the autocomplete suggests, read it, understand it, often realize it is wrong, and then keep typing. Then it suggests something else; rinse, repeat.
I got negative value from it and eventually turned it off. At least IntelliSense gives instant suggestions...
In all other cases, autocomplete was never faster than writing it myself.
Full code generation, where you can step away from the computer for 20 minutes, does save me time.