Posted by derek 7/1/2025
You have to catch most issues at code review.
You can have an agent spit out code, but on PR open you should have another agent verify the changes against rules you define as a team.
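That PR-open check could be sketched as a small script. Everything here is illustrative: `TEAM_RULES`, the prompt format, and `ask_llm` (a stand-in for whichever model API you actually call) are all made-up names, not part of any real tool.

```python
# Hypothetical sketch of a PR-gate reviewer: compose a prompt that asks a
# model to check a diff against team-defined rules. The model call itself
# is injected via `ask_llm`, a placeholder for your provider's API.

TEAM_RULES = [
    "No direct SQL string concatenation; use parameterized queries.",
    "Every new public function needs a docstring.",
]

def build_review_prompt(diff: str, rules: list[str]) -> str:
    """Compose the prompt the checking agent sends to the model."""
    rule_list = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(rules))
    return (
        "You are a code-review agent. Check the diff below against these "
        "team rules and report any violations with line references.\n\n"
        f"Rules:\n{rule_list}\n\nDiff:\n{diff}"
    )

def review_pr(diff: str, ask_llm) -> str:
    """Send the composed prompt to the model and return its report."""
    return ask_llm(build_review_prompt(diff, TEAM_RULES))
```

In practice you'd trigger something like `review_pr` from a CI job on the pull_request event and fail the build when the report lists violations.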
It's what I'm building at wispbit.
I guess keep them on backend/library tasks for now. I am sure the companies are already working on taking a snapshot of a browser page and feeding it back into a multimodal model so it can comprehend what "looking" means.
Can someone convince me they're doing their due diligence on this code if they're using this approach? I am smart and I am experienced, and I have trouble keeping on top of the changes and subtle bugs created by a single Claude Code instance.
To the author & anyone reading - publicly release your agent harnesses, even if it's shit or vibe coded! I am constantly iterating on my meta and seeking to improve.
Yes, AI-assisted workflows might be here to stay, but they won't be the magical put-programmers-out-of-a-job thing.
And this is the best product-market fit for LLMs. I imagine it will be even worse in other domains.
This is the absolute polar opposite of my experience. I'm in a large non-tech community with a coders channel, and every day we get a few more Claude Code converts. I would say that vibe-coding is moving into the mainstream with experienced, professional developers who were deeply skeptical a few months ago. It's no longer fancy auto-complete: I have myself seen the magic of wishing a (low importance) front-end app into existence from scratch in an hour or so, something that would have taken me an order of magnitude more time beforehand.
In the end, it had written 500 lines, the problem was still there, and the code didn't work any differently. It worries me that I don't know what those 500 lines were for.
In my experience, LLMs are amazing for writing 10-20 lines at a time, while you review and fix any errors. If I let them go to town on my code, I've found that's an expensive way to get broken code.
For sure, and me neither, for what it's worth. But most of the code I write isn't "hard" code; the hard code is also the stuff I enjoy writing the most. I will note that a few months ago I found them helpful for small things inside the GPT window, and then tried agentic mode (specifically Roo, then Claude Code), and have seen a huge speedup in my ability to get stuff done.
Who does this though? Maybe you should extract that into a library/method/abstraction?
https://www.reddit.com/r/ClaudeAI/comments/1loj3a0/this_pret...
This is a very interesting concept
Could this be extended to the point of an LLM producing/improving itself?
If not, what are the current limitations to get to that point?
Check out aider writing aider stats here: https://aider.chat/HISTORY.html
Aider writing its own code is definitely cool and within the same concept
I’d love to see an LLM or some sort of coding model that modifies or retrains itself.