I'm dialing back my LLM usage

Posted by sagacity 1 day ago

388 points | 226 comments | page 2

dwoldrich 1 day ago

I personally believe it's a mistake to invite AI into your editor/IDE. Keep it separate, in the browser, and keep to discrete, concise question-and-answer threads. Copy and paste whenever it delivers some gold (that comes with all the copy-pasta dangers, I know - oh, don't I know it!)

It's important to always maintain the developer role, don't ever surrender it.

obirunda 1 day ago

The dichotomy between the people who are "orchestrating" agents to build software and the people experiencing these less-than-ideal outcomes from LLMs is fascinating.

I don't think LLMs for coding productivity are all hype, but I think the people who "see the magic" are subject to many illusions, much like those who fall prey to an MLM pitch.

You can see all the claims aren't necessarily unfounded, but the lack of guaranteed reproducibility leaves the door open for many caveats in favor of belief for the believer and cynicism for everybody else.

For the believers if it's not working for one person, it's a skill issue related to providing the best prompt, the right rules, the perfect context and so forth. At what point is this a roundabout way of doing it yourself anyway?

hedgehog 1 day ago

Over the last couple of months I've gone from highly skeptical to a regular user (Copilot in my case). Two big things changed: First, I figured out that only some models are good enough to do the tasks I want (Claude Sonnet 3.7 and 4 out of everything I've tested). Second, it takes some infrastructure. I've added around 1000 words of additional instructions telling Copilot how to operate, and that's on top of tests (which you should have anyway) and 3rd party documentation. I haven't tried the fleet-of-agents thing; one VS Code instance is enough, and I want to understand the changes in detail.

Edit: In concrete terms the workflow is to allow Copilot to make changes, see what's broken, fix those, review the diff against the goal, simplify the changes, etc, and repeat, until the overall task is done. All hands off.

sarmadgulzar 1 day ago

Can relate. I've also shifted towards generating small snippets of code using LLMs, giving them a glance, and asking the LLM to write unit tests for them. And then I review the unit tests carefully. But integrating the snippets together into the bigger system, I always do that myself. LLMs can do it sometimes, but when the system becomes big enough that it can't fit into the context window, it's a real issue, because now the LLM doesn't know what's going on and neither do you. So I'd advise you to use LLMs to generate tedious bits of code, but you must have the overall architecture committed to memory as well, so that when the AI messes up, at least you have some clue about how to fix it.

causal 1 day ago

What's it called when you choose a task because it's easy, even if it's not what you need to do at all? I think that's what LLMs have activated in a lot of us: writing code used to be kinda hard, but now it's super easy, so let's just write more code.

The hard parts of engineering have always been decision making, socializing, and validating ideas against cold hard reality. But writing code just got easier so let's do that instead.

Prior to LLMs, writing 10 lines of code might have been a really productive day, especially if we were able to thoughtfully avoid writing 1,000 unnecessary lines. LLMs do not change this.

Velorivox 1 day ago

I'm not sure if there's a name for that specifically, but it seems strongly related to the streetlight effect. [0]

[0] https://en.wikipedia.org/wiki/Streetlight_effect

causal 22 hours ago

Yeah very apt

travisgriggs 1 day ago

This parrots much of my own experience.

I don’t have it write any of my Python firmware or Elixir backend stuff.

What I do let it rough in is web front end stuff. I view the need for and utility of LLMs in the html/css/tailwind/js space as an indictment of complexity and inconsistency. It’s amazing that the web front end stuff has just evolved over the years, organically morphing from one thing to another, but a sound well engineered simple-is-best set of software it is not. And in a world where my efforts will probably work in most browser contexts, no surprise that I’m willing to mix in a tool that will make results that will probably work. A mess is still a mess.

ivraatiems 17 hours ago

The way that LLMs are used and encouraged by businesses right now is evidence that they are mostly being pushed by people who don't understand software. I don't know very many actual software engineers who advocate for vibe-coding or using LLMs this way. I know a ton of engineers who advocate using them as helpful tools, myself included (in fact I changed my opinion on it as their capabilities grew, and I'll continue to do so).

Every tool is just a tool. No tool is a solution. Until and unless we hit AGI, only the human brain is that.

kadhirvelm 1 day ago

I wonder if this is as good as LLMs can get, or if this is a transition period between LLM as an assistant and LLM as a compiler. In the latter world we don’t need to care about the code because we just care about the features. We let the LLM deal with the code and we deal with the context, treating code more like a binary. In that world, I’d bet code gets the same treatment as memory management today, where only a small percentage of people need to manage it directly and most of us assume it happens correctly enough to not worry about it.

rzz3 1 day ago

Why wonder if this is “as good as LLMs can get” when we saw such a huge improvement between Claude 3.7 and Claude 4, released what, a couple weeks ago? Of course it isn’t as good as LLMs can get. Give it 3 more weeks and you’ll see it get better again.

kadhirvelm 1 day ago

I don’t doubt LLMs will become better assistants over time, as you said, every few weeks. I mean more whether LLMs will cross the assistant-to-compiler chasm, where we don’t have to think about the code anymore and can focus on just the features.

kadhirvelm 17 hours ago

Wrote more thoughts here: https://resync-games.com/blog/engineering/llms-as-compiler

kamens 1 day ago

Personally, I follow the simple rule: "I type every single character myself. The AI/agent/etc offers inspiration." It's an effective balance between embracing what the tech can do (I'm dialing up my usage) and maintaining my personal connection to the code (I'm having fun + keeping things in my head).

I wrote about it: https://kamens.com/blog/code-with-ai-the-hard-way

MrGilbert 1 day ago

My point of view: LLMs should be taken as a tool, not as a source of wisdom. I know someone who likes to answer people-related questions through an LLM. (E.g.: "What should this person do?" "What should we know about you?" etc.) More than once, this has left him in a state of limbo when he tries to explain what he meant by what he wrote. It feels a bit wild - a bit like back in school, when the guy who copied your homework is forced to explain how he ended up with the solution.

Nedomas 1 day ago

Two weeks ago I started heavily using Codex (I have 20y+ dev xp).

At first I was very enthusiastic and thought Codex was helping me multiplex myself. But you actually spend so much time trying to explain the most obvious things to Codex, and it gets them wrong all the time in some kind of nuanced way, that in the end you spend more time doing things via Codex than by hand.

So I also dialed back Codex usage and got back to doing many more things by hand again because it's just so much faster and much more predictable time-wise.

nsingh2 1 day ago

Same experience. These "background agents" are powered by models that aren't yet capable enough to handle large, tangled, or legacy codebases without human guidance, so the background part ends up being functionally useless in my experience.
