What Claude Code Chooses

Posted by tin7in 5 hours ago

140 points | 73 comments | page 2

nineteen999 3 hours ago|

This seems web-centric, and I expect that colors the decision making during this analysis somewhat.

People are using it for all kinds of other stuff, C/C++, Rust, Golang, embedded. And of course if you push it to use a particular tool/framework you usually won't get much argument from it.

mjheadd 3 hours ago||

Worth reading alongside recent research on AGENTS.md file effectiveness. The clearest use case for these files isn't describing your codebase, it's overriding default behavior. If your project has specific requirements around tooling (common in government and regulated industries), that's exactly what belongs in the AGENTS.md files.

esafak 2 hours ago||

It still ignores it. I always have to say 'Isn't this mentioned in AGENTS??' and it will concede that it is.

matheus-rr 1 hour ago||

In my experience the problem is how people write them. Descriptive statements get ignored because the model treats them as context it can reason past.

"We use PostgreSQL" reads as a soft preference. The model weighs it against whatever it thinks is optimal and decides you'd be better off with Supabase.

"NEVER create accounts for external databases. All persistence uses the existing PostgreSQL instance. If you're about to recommend a new service, stop." actually sticks.

The pattern that works: imperative prohibitions with specific reasoning. "Do not use Redis because we run a single node and pg_notify covers our pubsub needs" gives enough context that it won't reinvent the decision every session.

Your AGENTS.md should read less like a README and more like a linter config. Bullet points with DO/DON'T rules, not prose descriptions of your stack.
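To make the "linter config" idea concrete, here is a minimal Python sketch that sorts rule lines into descriptive statements, bare imperatives, and imperatives that carry a reason. The function name `classify_rule`, the keyword list, and the "because" heuristic are all assumptions made up for illustration; none of this is part of any AGENTS.md convention or tool.

```python
import re

# Hypothetical heuristics: the keyword list and the reason markers are
# assumptions for illustration, not an AGENTS.md standard.
IMPERATIVE = re.compile(r"^(NEVER|ALWAYS|DO NOT|DON'T|DO|MUST)\b", re.IGNORECASE)
REASON = re.compile(r"\b(because|since|so that)\b", re.IGNORECASE)


def classify_rule(line: str) -> str:
    """Label one rule line by how directive it is."""
    # Drop leading bullet characters and surrounding whitespace.
    text = line.strip().lstrip("-* ").strip()
    if not IMPERATIVE.match(text):
        return "descriptive"
    if not REASON.search(text):
        return "imperative-no-reason"
    return "imperative-with-reason"


if __name__ == "__main__":
    rules = [
        "We use PostgreSQL",
        "NEVER create accounts for external databases.",
        "- Do not use Redis because we run a single node and pg_notify covers our pubsub needs",
    ]
    for rule in rules:
        print(f"{classify_rule(rule):24} {rule}")
```

A check like this only flags phrasing. It says nothing about whether the model will actually follow the rule.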

toraway 1 hour ago||

Hah, it's somewhat ironic how this is almost the exact opposite of the prevailing folk wisdom I've read for the last 1-2 years: that you should never use negative instructions with specific details because it overweights the exact thing you're trying to avoid in the context.

Given my own experience futilely fighting with Claude/Codex/OpenCode to follow AGENTS.MD/CLAUDE.MD/etc with different techniques that each purport to solve the problem, I think the better explanation really is that they just don't work reliably enough to depend on to enforce rules.

matheus-rr 51 minutes ago||

Fair point on the contradiction. The "never use negative instructions" wisdom comes from general prompting where mentioning the unwanted thing can increase its likelihood. AGENTS.md is a different context though, the model is reading persistent rules for a session, not doing a single completion where priming effects matter as much.

But you're right that "better" isn't "reliable." In practice it went from "constantly ignored" to "followed maybe 80% of the time." The remaining 20% is the model encountering situations where it decides the instruction doesn't apply to this specific case.

Honest answer is probably somewhere between "they don't work" and "write them right and you're fine." They raise the floor but don't guarantee anything. I still use them because 80% beats 20%, but I wouldn't bet production correctness on them.

zzixp 3 hours ago||

Have any links?

ripped_britches 2 hours ago||

> Traditional cloud providers got zero primary picks

Good - all of them have a horrible developer experience.

Final straw for me was trying to put GHA runners in my Azure virtual net and spending 2 weeks on it.

rishabhaiover 4 hours ago||

I found the shift away from using Redis for caching, going from Sonnet 4.5 to Opus 4.6, remarkable. I wonder why that is the case? Maybe I need to see the code to better understand the use case of the cache in this context.

NiloCK 3 hours ago||

I'll be interested to hear stories - down the line - from the participants in the LLM SEO war [1].

Interesting that tailwind won out decisively in their niche, but still has seen the business ravaged by LLMs.

[1] https://paritybits.me/copilot-seo-war/

0x457 3 hours ago|

It's like tailwindcss was purposely designed to be managed by an LLM.

dmix 3 hours ago||

LLMs are going to keep React alive for the indefinite future.

Especially with all the no-code app building tools like Lovable, which deal with the potential security issues of an LLM running wild on a server by only allowing it to build a client-side React+Vite app using Supabase JWT.

ch4s3 2 hours ago||

It's really disappointing to see it so strongly preferring GitHub Actions, which is in my experience terrible. Almost everything about GHA pushes you in the direction of constantly blowing out the 10GB cache limit in an attempt to have CI not run for ages. I also feel like the standard cache action using git works poorly with any tools that use mtime on files to determine freshness.

I guess at least Opus can help you muddle through GHA being so crappy.
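The mtime complaint above can be sketched in a few lines of Python. This assumes a make-style freshness rule (rebuild when any source is newer than the target), that a fresh checkout stamps source files with the checkout time, and that a restored cache keeps the artifact's older timestamp. The file names, the timestamps, and the `is_stale` helper are hypothetical.

```python
import os
import tempfile


def is_stale(target: str, sources: list[str]) -> bool:
    """Make-style check: stale if the target is missing or older than any source."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)


def demo() -> tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "main.c")   # hypothetical source file
        obj = os.path.join(workdir, "main.o")   # hypothetical cached artifact
        for path in (src, obj):
            with open(path, "w") as handle:
                handle.write("x")

        # Original build: source stamped t=1000, artifact built at t=2000.
        os.utime(src, (1000, 1000))
        os.utime(obj, (2000, 2000))
        before = is_stale(obj, [src])

        # Simulated CI run: the checkout re-stamps the unchanged source at
        # t=3000, while the restored artifact still carries t=2000.
        os.utime(src, (3000, 3000))
        after = is_stale(obj, [src])
        return before, after


if __name__ == "__main__":
    print(demo())
```

Under those assumptions the artifact looks up to date on the original machine and stale after the simulated checkout, even though the source content never changed, so the cache hit buys nothing.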

nhumrich 2 hours ago|

It has one thing going for it: Setup.

And by setup I mean integration and account creation. You don't have to do it. You already have a git repo, just add some yaml, and Bob's your uncle.

ch4s3 1 hour ago||

It’s very Microsoft in that way.

WA 4 hours ago||

Not sure what to make of this. React is missing entirely. Or is this report also assuming that React is the default for everything and not worth mentioning at all? Just like shadcn/ui's first mention of React is somewhere down the page or hidden in the docs?

Furthermore, what's the point of "no tools named"? Why would I restrict myself like that? If I put "use Nodejs, Hono, TypeScript and use Hono's html helper to generate HTML on the server like it's 2010, write custom CSS, minimize client-side JS, no Tailwind" in CLAUDE.md, it happily follows this.

godtoldmetodoit 3 hours ago||

As someone who runs a small dev agency, I'm very interested in research like this.

Let's say some doctor decides to vibecode an app on the weekend, with next to 0 exposure to software development until she started hearing about how easy it was to create software with these tools. She makes incredible progress and is delighted with how well it works, but as she considers actually opening it up to the world she keeps running into issues. How do I know this is secure? How do I keep this maintained and running?

I want to be in a position where she can find me to get professional help, so it's very helpful to know what stacks these kinds of apps are being built in.

chasd00 2 hours ago|||

claudecode _loves_ shadcn/ui. I hadn't even heard of it until i was playing around with claudecode. It seems fine to me and if the coding agent loves it then more power to it, i don't really care. That's the problem.

I think that makes coding agent choices extremely suspect, like i don't really care what it uses as long as what's produced works and functions in line with my expectations. I can totally see companies paying Anthropic to promote their tool of choice to the top of claudecode's preferences. After thinking about it, i'm not sure if that's a problem or not. I don't really care what it uses as long as my requirements (all of them) are met.

furyofantares 4 hours ago|||

> Furthermore, what's the point of "no tools named"?

There are vibe coders out there that don't know anything about coding.

nineteen999 3 hours ago||

I mean, i guess that will shortly put an end to the "no code" movement.

skywhopper 2 hours ago||

Because the primary and future audience of Claude et al don’t know the tools they want, or even that a choice exists.

almosthere 4 hours ago||

I didn't read the report just the "finding" - but at least for launchdarkly it's nice that it chose a roll-your-own, i hate feature flag SaaS, but that's just me

elophanto_agent 4 hours ago|

[flagged]

RyanShook 3 hours ago|

Bot comment