You mentioned "harness engineering". How do you approach building "actual programmed tools" (like screenshot scripts) specifically for an LLM's consumption rather than a human's? Are there specific output formats or constraints you’ve found most effective?
That said, I've given it a go. I used Zed, which I think is a pretty great tool. I bought a Pro subscription and used the built-in agent with Claude Sonnet 4.x and Opus. I'm a Rails developer in my day job, and, like MitchellH and many others, found out fairly quickly that tasks for the LLM need to be quite specific and discrete. The agent is great at renames and minor refactors, but my preferred use was to have it write RSpec tests once I'd written something like a controller or service object.
And generally, the LLM agent does a pretty great job of this.
But here's the rub: I found that I was losing the ability to write RSpec.
I went to do it manually and found myself trying to remember API calls and approaches required to write some specs. The feeling of skill leaving me was quite sobering and marked my abandonment of LLMs and Zed, and my return to neovim, agent-free.
The thing is, this is a common experience. If you don't use it, you lose it. It applies to all things: fitness, language (natural or otherwise), skills of all kinds. Why should it not apply to thinking itself?
Now you may write me and my experience off as that of a lesser mind and insist that you won't have such a problem: you've been doing it so long that it's "hard-wired in" by now. Perhaps.
It's in our nature to take the path of least resistance, to seek ease and convenience at every turn. We've certainly given away our privacy and anonymity so that we can pay for things with our phones and send email for "free".
LLMs are the ultimate convenience: a peer or slave mind that we can use to do our thinking and our work for us. Some believe that the LLM represents a local maximum, that the approach can't get much better. I dunno, but as AI improves, we will hand over more and more thinking and work to it. To do otherwise would be to go against our very nature and every other choice we've made so far.
But it's not for me. I'm no MitchellH, and I'm probably better off performing the mundane activities of my work, as well as the creative ones, so as to preserve my hard-won knowledge and skills.
YMMV
I'll leave off with the quote that resonates the most with me as I contemplate AI:
"I say your civilization, because as soon as we started thinking for you, it really became our civilization, which is, of course, what this is all about." -- Agent Smith "The Matrix"
- When tests didn't work I had to dig into what was going on, and the LLMs do cheat a lot with Volkswagen tests (specs rigged to pass without actually verifying anything; see the sketch after this list), which made me skeptical even of what the agents were writing.
- When things were broken, the spaghetti and awful code tended to be written in such an obnoxious way that it was beyond repair, and it made me wish I had done it from scratch.
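To illustrate what I mean by a Volkswagen test, here's a minimal RSpec sketch (OrderProcessor is an invented stand-in, not real project code): the spec stubs the very method it claims to verify, so it passes no matter how broken the implementation is.

    # Hypothetical class standing in for real application code.
    class OrderProcessor
      def process
        raise NotImplementedError # the real logic could be anything, or broken
      end
    end

    RSpec.describe OrderProcessor do
      it "processes the order" do
        processor = described_class.new
        # Stubbing the method under test turns the expectation into a tautology.
        allow(processor).to receive(:process).and_return(true)
        expect(processor.process).to be(true)
      end
    end

Run it with `rspec` and it goes green even though the real #process raises, which is exactly the kind of false confidence that made me double-check everything.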
Thankfully I only tried using agents for tests and not for the actual code, but it makes me seriously doubt whether "vibe coding" really produces quality work.
This is the main reason to use AI agents, though: multitasking. If I'm working on some Terraform changes and I fire off an agent loop, I know it's going to take a while to produce something working. While I'm waiting for it to come back and pretend it's finished (really I'll have to fix it), I start another agent on something else. I flip back and forth between the finished runs as they notify me. At the end of the day I have five things finished rather than two.
The "agent" doesn't have to be anything special either. Anything you can run in a VM or container (vscode w/copilot chat, any cli tool, etc) so you can enable YOLO mode.
I think this is something people ignore, and it's significant. The only way to get good at coding with LLMs is to actually try doing it, even if it's inefficient or slower at first. It's just another skill to develop [0].
And it's not really about using all the plugins and features available; in fact, many of them are counterproductive. Just learn how to prompt and steer the LLM better.
[0]: https://ricardoanderegg.com/posts/getting-better-coding-llms...
Versus other threads (here on HN, and especially on places like LinkedIn) where it's "I set up a pipeline and some agents and now I type two sentences and amazing technology comes out in 5 minutes that would have taken 3 devs 6 months to do".
I just recently added in Codex, since it comes with my $20/mo subscription to GPT and that's lowering my Claude credit usage significantly... until I hit those limits at some point.
$20 × 12 + $300 + 5 × $200 = $1,540... so about $1500-$1600/year.
It is 100% worth it for what I'm building right now, but my fear is that I'll take a break from coding and end up paying for subscriptions I'm not using.
I'd prefer to move to a model where I'm paying for compute time as I use it, instead of worrying about tokens/credits.