People are using it for all kinds of other stuff: C/C++, Rust, Golang, embedded. And of course if you push it to use a particular tool/framework you usually won't get much argument from it.
"We use PostgreSQL" reads as a soft preference. The model weighs it against whatever it thinks is optimal and decides you'd be better off with Supabase.
"NEVER create accounts for external databases. All persistence uses the existing PostgreSQL instance. If you're about to recommend a new service, stop." actually sticks.
The pattern that works: imperative prohibitions with specific reasoning. "Do not use Redis because we run a single node and pg_notify covers our pubsub needs" gives enough context that it won't reinvent the decision every session.
Your AGENTS.md should read less like a README and more like a linter config. Bullet points with DO/DON'T rules, not prose descriptions of your stack.
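To make the "linter config" shape concrete, here's a minimal sketch that renders DO/DON'T rules, each with its reason attached, as one bullet per rule. The `Rule` class and `render_rules` helper are names I made up for illustration, and the bullet format is just one plausible layout, not anything the tools require.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One imperative rule plus the reasoning that makes it stick (hypothetical type)."""
    kind: str       # "DO" or "DON'T"
    directive: str  # the imperative itself
    reason: str     # why, so the model doesn't re-litigate the decision


def render_rules(rules):
    """Render rules as one terse bullet each, linter-config style."""
    lines = []
    for rule in rules:
        if rule.kind not in ("DO", "DON'T"):
            raise ValueError(f"unknown rule kind: {rule.kind!r}")
        lines.append(f"- {rule.kind}: {rule.directive} Reason: {rule.reason}")
    return "\n".join(lines)


# Example rules taken from the prohibitions quoted above.
RULES = [
    Rule("DO", "Use the existing PostgreSQL instance for all persistence.",
         "An instance already exists."),
    Rule("DON'T", "Create accounts for external databases.",
         "If you're about to recommend a new service, stop."),
    Rule("DON'T", "Use Redis.",
         "We run a single node and pg_notify covers our pubsub needs."),
]

if __name__ == "__main__":
    print(render_rules(RULES))
```

You could just write the bullets by hand; the point is the shape, where every rule is an imperative with its reason on the same line.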
Given my own experience futilely fighting to get Claude/Codex/OpenCode to follow AGENTS.md/CLAUDE.md/etc. with different techniques that each purport to solve the problem, I think the better explanation really is that they just don't work reliably enough to depend on for enforcing rules.
But you're right that "better" isn't "reliable." In practice it went from "constantly ignored" to "followed maybe 80% of the time." The remaining 20% is the model encountering situations where it decides the instruction doesn't apply to this specific case.
Honest answer is probably somewhere between "they don't work" and "write them right and you're fine." They raise the floor but don't guarantee anything. I still use them because being followed 80% of the time beats being constantly ignored, but I wouldn't bet production correctness on them.
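If you don't want to bet correctness on the instructions file, one option is to back the same prohibitions with a mechanical check. This is a rough sketch that assumes a requirements.txt-style dependency list; the `BANNED` table and `find_banned` function are hypothetical names, and the reasons are copied from the rules above.

```python
import re

# Hypothetical deny-list mirroring the prohibitions in the instructions file.
BANNED = {
    "redis": "single node; pg_notify covers our pubsub needs",
    "supabase": "all persistence uses the existing PostgreSQL instance",
}


def find_banned(requirement_lines, banned=BANNED):
    """Return (package, reason) pairs for banned packages in requirements-style lines."""
    violations = []
    for line in requirement_lines:
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # The package name is whatever precedes a version specifier, extra, or marker.
        name = re.split(r"[<>=!~\[;\s]", line, maxsplit=1)[0].lower()
        if name in banned:
            violations.append((name, banned[name]))
    return violations
```

Run in CI, something like this fails the build the 20% of the time the model decides the rule doesn't apply, instead of relying on it to remember.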
Good - all of them have a horrible developer experience.
Final straw for me was trying to put GHA runners in my Azure virtual net and spending 2 weeks on it.
Interesting that Tailwind won out decisively in its niche, but has still seen its business ravaged by LLMs.
Especially with all the no-code app-building tools like Lovable, which deal with the potential security issues of an LLM running wild on a server by only allowing it to build a client-side React+Vite app using Supabase JWT.
I guess at least Opus can help you muddle through GHA being so crappy.
And by setup I mean integration and account creation. You don't have to do it. You already have a git repo, just add some YAML, and Bob's your uncle.
Furthermore, what's the point of "no tools named"? Why would I restrict myself like that? If I put "use Node.js, Hono, TypeScript and use Hono's html helper to generate HTML on the server like it's 2010, write custom CSS, minimize client-side JS, no Tailwind" in CLAUDE.md, it happily follows this.
Let's say some doctor decides to vibecode an app on the weekend, with next to zero exposure to software development until she started hearing about how easy it was to create software with these tools. She makes incredible progress and is delighted by how well it works, but as she considers actually opening it up to the world she keeps running into questions. How do I know this is secure? How do I keep this maintained and running?
I want to be in a position where she can find me to get professional help, so it's very helpful to know what stacks these kinds of apps are being built in.
I think that makes coding agent choices extremely suspect. I can totally see companies paying Anthropic to promote their tool of choice to the top of Claude Code's preferences. After thinking about it, I'm not sure if that's a problem or not. I don't really care what it uses as long as what's produced works, functions in line with my expectations, and meets my requirements (all of them).
There are vibe coders out there that don't know anything about coding.