Posted by bigwheels 1 day ago

A few random notes from Claude coding quite a bit last few weeks (twitter.com)
https://xcancel.com/karpathy/status/2015883857489522876
257 points | 280 comments | page 4
vibeprofessor 14 hours ago|
The AGI vibes with Claude Code are real, but the micromanagement tax is heavy. I spend most of my time babysitting agents.

I expect interviews will evolve into "build project X with an LLM while we watch" plus an audit of agent specs.

maxdo 7 hours ago||
I've been doing vibe code interviews for nearly a year now. Most people are surprisingly bad with AI tools. We specifically ask them to bring their preferred tool, yet 20–30% still just copy-paste code from ChatGPT.

Fun stat: the correlation is real. People who were good at vibe coding also had offer(s) from other companies that didn't run vibe code interviews.

xyzsparetimexyz 2 hours ago|||
Copy pasting from chatgpt is the most secure option.
bflesch 3 hours ago|||
Interesting you say that, feels like when people were too stupid to google things and "googling something" was a skill that some had and others didn't.
thefourthchime 7 hours ago|||
From what I've heard, the few interviews there are for software engineers these days do have you use models and see how quickly you can build things.
iwontberude 7 hours ago||
The interviews I’ve given have asked about how to control for AI slop without hurting your colleagues' feelings. Anyone can prompt and build; the harder part, as usual for business, is knowing how and when to say, ‘no.’
0xy 14 hours ago||
Sounds great to me. Leetcode is outdated and heavily abused by people who share the questions ahead of time in various forums and chats.
maximedupre 6 hours ago||
> It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful

It does hurt, that's why all programmers now need an entrepreneurial mindset... you become if you use your skills + new AI power to build a business.

xyzsparetimexyz 2 hours ago|
What about the people who don't want to be entrepreneurs?
maximedupre 1 hour ago||
They have to pivot to something else
maximedupre 1 hour ago||
Or stay ahead of the curve as long as possible, e.g. work on the loop/ralphing
philipwhiuk 4 hours ago||
> It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later.

The bits left unsaid:

1. Burning tokens, which we charge you for

2. My CPU does this when I tell it to do bogosort on a million 32-bit integers, it doesn't mean it's a good thing

randoglando 2 hours ago||
Senpai has taken the words out of my mouth and put them on the page.
jopsen 1 day ago||
> - How much of society is bottlenecked by digital knowledge work?

Any qualified guesses?

I'm not convinced more traders on Wall Street will allocate capital more effectively, leading to economic growth.

Will more programmers grow the economy? Or should we get real jobs ;)

iwontberude 7 hours ago|
Most of this countries challenges are strictly political. The pittance of work software can contribute is most likely negligible or destructive (e.g. software buttons in cars or Palantir). In other words we've picked all the low-hanging fruit and all that left is to hang ourselves.
js8 6 hours ago|||
I actually disagree. Having software (AI) that can cut through the technological stuff faster will make people more aware of political problems.
iwontberude 5 hours ago|||
edit: country's* all that is left*
tintor 3 hours ago||
"you can review code just fine even if you struggle to write it."

Well, merely approving code takes no skill at all.

roblh 3 hours ago|
Seriously, that’s a completely nonsense line.
superze 3 hours ago||
I don't know about you guys, but most of the time it's spitting out nonsense models in SQLAlchemy and I have to constantly correct it, to the point where I am back at writing the code myself. The bugs are just astonishing, and I lose control of the codebase after some time, to the point where reviewing the whole thing just takes a lot of time.

On the other hand, if it were for a job in the public sector I would just let the LLM spit out some output and play stupid, since the salary is very low.

rschick 1 day ago||
Great point about expansion vs speedup. I now have time to build custom tools, implement more features, try out different API designs, get 100% test coverage. I can deliver more quickly, but can also deliver more overall.
hollowturtle 7 hours ago||
> Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December

Anyone wondering what exactly he is actually building? What? Where?

> The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do.

I would LOVE to have just syntax errors produced by LLMs. "Subtle conceptual errors that a slightly sloppy, hasty junior dev might do" are neither subtle nor slightly sloppy; they are actually serious and harmful, and junior devs have no experience to fix those.

> They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?"

Why not just hand-write 100 LOC, with the help of an LLM for tests, documentation and some autocomplete, instead of making it write 1000 LOC and then cleaning it up? That is also very difficult to do; 1000 lines is a lot.

> Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day.

It's a computer program running in the cloud, what exactly did he expect?

> Speedups. It's not clear how to measure the "speedup" of LLM assistance.

See above

> 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's speedup, but it's possibly a lot more an expansion.

Mmm, not sure. If you don't have domain knowledge you could have an initial stab at the problem, but what about when you need to iterate on it? You can't if you don't have domain knowledge of your own.

> Fun. I didn't anticipate that with agents programming feels more fun because a lot of the fill in the blanks drudgery is removed and what remains is the creative part.

No, it's not fun; e.g. LLMs produce uninteresting UIs, mostly bloated with React/HTML.

> Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually.

My bet is that sooner or later he will get back to coding by hand for periods of time to avoid that, like many others; the damage that overreliance on these tools brings is serious.

> Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.

No, programming is not "syntactic details"; the practice of programming is everything but "syntactic details". One should learn how to program, not language X or Y.

> What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows a lot.

Yet no measurable economic effects so far

> Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro).

Did people with a smartphone outperform photographers?

TaupeRanger 6 hours ago||
Lots of very scared, angry developers in these comment sections recently...
Banditoz 1 hour ago|||
This is extremely reductive and incredibly dismissive of everything they wrote above.
hollowturtle 6 hours ago||||
Not angry nor scared, I value my hard skills a lot, I'm just wondering why people believe religiously everything AI related. Maybe I'm a bit sick with the excessive hype
hollowturtle 6 hours ago||||
Also note that I'm a heavy LLM user, not anti ai for sure
thr59182617 4 hours ago|||
I see way more hype that is boosted by the moderators. The scared ones are the nepo babies who founded a vaporware AI company that will be bought by daddy or friends through a VC.

They have to maintain the hype until a somewhat credible exit appears and therefore lash out with boomer memes, FOMO, and the usual insane talking points like "there are builders and coders".

simianwords 4 hours ago||
i'm not sure what kind of conspiracy you are hallucinating. do you think people have to "maintain the hype"? it is doing quite well organically.
hollowturtle 3 hours ago||
So well that they're losing billions and OpenAI may go bankrupt this year
simianwords 3 hours ago||
what if it doesn't?
hollowturtle 3 hours ago||
better for them! the heck i care about it
simianwords 4 hours ago||
This is a low quality curmudgeonly comment
hollowturtle 3 hours ago|||
Now that you contributed zero net to the discussion and learned a new word you can go out and play with toys! Good job
potatogun 4 hours ago|||
You learned a new adjective? If people move beyond "nice", "mean" and "curmudgeonly" they might even read Shakespeare instead of having an LLM producing a summary.
simianwords 4 hours ago||
cool.

>Anyone wondering what exactly is he actually building? What? Where?

this is trivially answerable. it seems like they did not do even the slightest bit of research before asking question after question to seem smart and detailed.

hollowturtle 3 hours ago||
I asked many questions and you focused on only one. Btw, yes, I did my research, and I know him because I followed almost every tutorial he has on YouTube, and he never mentions clearly what weekend project he worked on that led him to such claims. I had very high respect for him, until at some point he started acting like the Jesus Christ of LLMs.
simianwords 3 hours ago||
it's not clear why you asked that question if you knew the answer to it?
nadis 1 day ago|
The section on IDEs/agent swarms/fallibility resonated a lot for me; I haven't gone quite as far as Karpathy in terms of power usage of Claude Code, but some of the shifts in mistakes (and reality vs. hype) analysis he shared seems spot on in my (caveat: more limited) experience.

> "IDEs/agent swarms/fallability. Both the "no need for IDE anymore" hype and the "agent swarm" hype is imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE.md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow, my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits."
