
Posted by bigwheels 1/26/2026

A few random notes from Claude coding quite a bit last few weeks (twitter.com)
https://xcancel.com/karpathy/status/2015883857489522876
391 points | 380 comments | page 4
siliconc0w 1/27/2026|
Not sure how he is measuring, but I'm still closer to about a 60% success rate. It's more like 20% is an acceptable one-shot, this goes to 60% acceptable with some iteration, but 40% either needs manual intervention to succeed or such significant iteration that manual is likely faster.

I can supervise maybe three agents in parallel before a task requiring significant hand-holding means I'm likely blocking an agent.

And the time an agent is 'restlessly working' on something is usually inversely correlated with its likelihood of success. Usually if it's going down a rabbit hole, the correct thing to do is to intervene and reorient it.

alexose 1/27/2026||
It's refreshing to see one of the top minds in AI converge on the same set of thoughts and frustrations as me.

For as fast as this is all moving, it's good to remember that most of us are actually a lot closer to the tip of the spear than we think.

TheGRS 1/27/2026||
I do feel a big mood shift after late November. I switched to using Cursor and Gemini primarily and it was a big change in my ability to get my ideas into code effectively. The Cursor interface for one got to a place that I really like and enjoy using, but it's probably more that the results from the agents themselves are less frustrating. I can deal with the output more now.

I'm still a little iffy on the agent swarm idea. I think I will need to see it in action in an interface that works for me. To me it feels like we are anthropomorphizing agents too much, and that results in this idea that we can put agents into roles and then combine them into useful teams. I can't help seeing all agents as the same automatons, and I have trouble understanding why giving an agent different guidelines to follow, and then having it follow along another agent, would give me better results than just fixing the context in the first place. Either that or just working more on the code pipeline to spot issues early on - all the stuff we already test for.

all2well 1/27/2026||
What particular setups are getting folks these sorts of results? If there’s a way I could avoid all the babysitting I have to do with AI tools that would be welcome
geraneum 1/27/2026||
> If there’s a way I could avoid all the babysitting I have to do with AI tools that would be welcome

OP mentions that they are actually doing the “babysitting”

spongebobstoes 1/27/2026||
i use codex cli. work on giving it useful skills. work on the other instruction files. take Karpathy tips around testing and declarativeness

use many simultaneously, and bounce between them to unblock them as needed

build good tools and tests. you will soon learn all the things you did manually -- script them all

daxfohl 1/27/2026||
I'm curious to see what effect this change has on leadership. For the last two years it's been "put everything you can into AI coding, or else!" with quotas and firings and whatever else. Now that AI is at the stage where it can actually output whole features with minimal handholding, is there going to be a Frankenstein moment where leadership realizes they now have a product whose codebase is running away from their engineering team's ability to support it? Does it change the calculus of what it means to be underinvested vs overinvested in AI, and what are the implications?
vibeprofessor 1/27/2026||
The AGI vibes with Claude Code are real, but the micromanagement tax is heavy. I spend most of my time babysitting agents.

I expect interviews will evolve into "build project X with an LLM while we watch" and audit of agent specs

maxdo 1/27/2026||
I've been doing vibe code interviews for nearly a year now. Most people are surprisingly bad with AI tools. We specifically ask them to bring their preferred tool, yet 20–30% still just copy-paste code from ChatGPT.

fun stats: the correlation is real. people who were good at vibe coding also had offer(s) from other companies that didn't run vibe code interviews.

xyzsparetimexyz 1/27/2026|||
Copy pasting from chatgpt is the most secure option.
bflesch 1/27/2026|||
Interesting you say that, feels like when people were too stupid to google things and "googling something" was a skill that some had and others didn't.
thefourthchime 1/27/2026|||
From what I've heard, what few interviews there are for software engineers these days, they do have you use models and see how quickly you can build things.
iwontberude 1/27/2026||
The interviews I’ve given have asked about how to control for AI slop without hurting your colleagues’ feelings. Anyone can prompt and build; the harder part, as usual for business, is knowing how and when to say, ‘no.’
0xy 1/27/2026||
Sounds great to me. Leetcode is outdated and heavily abused by people who share the questions ahead of time in various forums and chats.
tomlockwood 1/28/2026||
Oh wow! Guy whose current project depends on AI being good is talking about AI being good.

Interesting.

ositowang 1/27/2026||
It’s a great and insightful review—not over-hyping the coding agent, and not underestimating it either. It acknowledges both its usefulness and its limitations. Embracing it and growing with it is how I see it too.
forrestthewoods 1/27/2026||
HN should ban any discussion on “things I learned playing with AI” that doesn’t include direct artifacts of the thing built.

We’re about a year deep into “AI is changing everything” and I don’t see 10x software quality or output.

Now don’t get me wrong I’m a big fan of AI tooling and think it does meaningfully increase value. But I’m damn tired of all the talk with literally nothing to show for it or back it up.

sota_pop 1/28/2026|
> Slopacolypse

Really… REALLY not looking forward to getting this word spammed at me the next 6-12 months… even less so seeing the actual manifestation.

> TLDR

This should be at the start?

I actually have been thinking of trying out ClaudeCode/OpenCode over this past week… can anyone provide experience, tips, tricks, ref docs?

My normal workflow is using Free-tier ChatGPT to help me interrogate or plan my solution/approach or to understand some docs/syntax/best practice with which I’m not familiar, then doing the implementation myself.

gverrilla 1/28/2026|
The official Claude Code docs are quite nice - that's where I started.