
Posted by bigwheels 1/26/2026

A few random notes from Claude coding quite a bit last few weeks (twitter.com)
https://xcancel.com/karpathy/status/2015883857489522876
911 points | 847 comments
elif 1/28/2026|
Why am I not surprised that a blog post was written about LLM coding going from 20% to 80% useful, yet all of the HN comments are still nitpicking negative details rather than building positive ideas toward some progress...

Is the programmer ego really this fragile? At least luddites had an ideological reasoning, whereas here we just seem to have emotional reflexes.

phito 1/28/2026|
It's because we see a bunch of people completely ignoring the missing 20% and flooding the world with complete slop. The pushback is required to keep us sane; we need people reminding others that it's not at 100% yet, even if it sometimes feels like it.
hollowturtle 1/28/2026|||
Then you have Anthropic, which states on its own blog that engineers fully delegate to Claude Code only 0 to 20% of the time: https://www.anthropic.com/research/how-ai-is-transforming-wo...

The fact that people keep pushing figures like 80% is total bs to me

an0malous 1/28/2026||
It’s usually people doing side projects, or non-programmers who can’t tell the code is slop. None of these vibe coding evangelists ever share the code they’re so amazed by, even though by their own logic anyone should be able to generate the same code with AI.
bob1029 1/28/2026|||
This kind of thought policing is getting to be exhausting. Perhaps we need a different kind of push back.

Do you know what my use case is? Do you know what kind of success rate I would actually achieve right now? Please show me where my missing 20% resides.

phito 1/28/2026||
Thought policing, lol. People are just sharing their perspectives, no need to take it personally. Glad it's working well for you.
cyanydeez 1/26/2026||
So I'm curious, what's the actual quality control?

Like, do these guys actually dogfood the real user experience, or are they all admins with a fast lane to the real model, while everyone outside the org has to go through the ten layers of model shedding, caching, and other means and methods of saving money?

We all know these models are expensive as fuck to run and these companies are degrading service, A/B testing, and the rest. Do they actually ponder these things directly?

It just always seems like people are on drugs when they talk about the capabilities, and, like, the drugs could be pure shit (good) or ditch weed, and we all just act like the pipeline for drugs is a consistent thing, but it's really not, not at this stage where they're all burning cash on infrastructure. Definitely, like drug dealers, you know they're cutting the good stuff with low-cost cached gibberish.

quinnjh 1/27/2026||
> Definitely, like drug dealers, you know they're cutting the good stuff with low cost cached gibberish.

Can confirm. My partner's ChatGPT wouldn't return anything useful for a specific query involving web use, while I got the desired result sitting side by side. She contacted support and they said there was nothing they could do about it; her account is in an A/B test group with some features removed. I imagine this saves them considerable resources, despite still billing customers for them.

How much this is occurring is anyone's guess.

bigwheels 1/27/2026||
If you access a model through an OpenRouter provider it might be quantized (akin to being "cut with trash"), but when you go directly to Anthropic or OpenAI you are getting access to the same APIs as everyone else. Even top-brass folks within Microsoft use Anthropic and OpenAI proper (not worth the red-tape trouble to go through Azure). Also, the creator and maintainer of Claude Code, Boris Cherny, was a bit of an oddball but one of the comparatively nicer people at Anthropic, and he indicated he primarily uses the same Anthropic APIs as everyone else (which makes sense from a product development perspective).
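
For context on the quantization aside: quantizing a model stores its weights at lower precision to cut serving cost, at a small accuracy loss. A minimal sketch of symmetric int8 quantization (illustrative only, not any provider's actual scheme):

```python
# Minimal sketch of symmetric int8 weight quantization, the kind of
# precision cut the comment alludes to. Values are illustrative.

def quantize_int8(weights):
    """Map float weights to int8 values plus a shared scale factor."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [x * scale for x in q]

weights = [0.82, -1.27, 0.003, 0.5]
q, s = quantize_int8(weights)
approx = dequantize(q, s)
# The recovered weights are close to, but not exactly, the originals:
max_err = max(abs(a - b) for a, b in zip(weights, approx))
```

The rounding error is bounded by half the scale factor, which is why aggressive quantization degrades output quality in ways that are hard for a user to pin down.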

The underlying models are all actually really undifferentiated under the covers except for the post-training and base prompts. If you eliminate the base prompts the models behave near identically.

A conspiracy would be a helluva lot more interesting and fun, but I've spoken to these folks firsthand and it seems they already have enough challenges keeping the beast running.

tariky 1/28/2026||
I used CC a year ago and it was not good. But a month ago I paid for Max and started to rebuild my company's web shop using it.

It's like I was plowing land by hand a year ago, and now I'm in a brand new John Deere. It's amazing.

Of course it's not perfect, but if you understand the code and the problem it needs to solve, then it works really well.

tintor 1/27/2026||
"you can review code just fine even if you struggle to write it."

Well, merely approving code takes no skill at all.

roblh 1/27/2026|
Seriously, that’s a completely nonsense line.
energy123 1/28/2026||
A big wow moment coming up is going to be GPT 5.* in Codex with Cerebras doing inference. The inference speed is going to be a big unlock, because many tasks are intrinsically serial.

It's going to feel literally like playing God, where you type in what you want and it happens ~instantly.

brcmthrowaway 1/28/2026|
When?
energy123 1/28/2026||
I don't know when but I'm going off:

- "OpenAI is partnering with Cerebras to add 750MW of ultra low-latency AI compute"

- Sam Altman saying that users want faster inference more than lower cost in his interview.

- My understanding that many tasks are serial in nature.
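
The serial-tasks point can be made concrete with back-of-the-envelope arithmetic (the numbers below are illustrative, not benchmarks): an agent task that is a chain of dependent model calls cannot be parallelized, so wall-clock time scales linearly with token throughput.

```python
# Illustrative sketch: a serial chain of dependent model calls benefits
# linearly from faster inference, because no call can start before the
# previous one finishes. Numbers are made up for illustration.

def chain_latency(calls, tokens_per_call, tokens_per_second):
    """Wall-clock seconds for a strictly serial chain of model calls."""
    return calls * tokens_per_call / tokens_per_second

slow = chain_latency(calls=20, tokens_per_call=1000, tokens_per_second=100)
fast = chain_latency(calls=20, tokens_per_call=1000, tokens_per_second=2000)
speedup = slow / fast  # 20x faster tokens -> 20x faster end-to-end
```

This is why raw inference speed matters more than batch throughput for agentic workloads: the chain's critical path is the whole task.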

cactusplant7374 1/28/2026||
Speed is really important to me but also I would like higher weekly limits -- which means lower cost I suppose. Building out complex projects can take 6 months to a year on a Pro plan.
energy123 1/28/2026||
Same experience with Pro.

My trick is to attach the codebase as a txt file to 5-10 different GPT 5.2 Thinking chats, paste in the specs, get the hard work done there, then just copy-paste the final task list into Codex to lower Codex usage.
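
A minimal sketch of the "attach the codebase as a txt file" step; the extension list and skipped directories below are assumptions for illustration, not anything from the comment:

```python
# Sketch: concatenate a repo's source files into one .txt attachment.
# EXTS and SKIP_DIRS are assumptions; adjust for your project.

from pathlib import Path

SKIP_DIRS = {".git", "node_modules", "__pycache__"}
EXTS = {".py", ".ts", ".md"}

def bundle(root: str, out: str = "codebase.txt") -> int:
    """Write all matching source files under root into a single txt
    file, each preceded by a header line with its path. Returns the
    number of files bundled."""
    parts = []
    for p in sorted(Path(root).rglob("*")):
        if p.is_file() and p.suffix in EXTS and not SKIP_DIRS & set(p.parts):
            parts.append(f"===== {p} =====\n{p.read_text(errors='replace')}")
    Path(out).write_text("\n\n".join(parts))
    return len(parts)
```

Usage would be something like `bundle("my-repo", "my-repo.txt")`, then attaching the output file to each chat.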

erelong 1/28/2026||
> 80% agent coding

A lot of these things sound cool but sometimes I'm curious what they're actually building

Like, is their bottleneck creativity now, then? Are they building anything interesting, or using agents to build... things that don't appeal to me, anyway?

ewidar 1/28/2026|
I guess it depends on what appeals to you.

As an example finding myself in a similar 80% situation, over the last few months I built

- a personal website with my projects and poems

- an app to rework recipes in a format I like from any source (text, video,...)

- a 3d visual version of a project my nephew did for work

- a gym class finder in my area with filters the websites don't provide

- a football data game

- a SaaS for work (in progress), so typical SaaS stuff

I was never that productive on personal projects, so this is great for me.

Also, the coding part of these projects was not very appealing to me, only the output, so it fits well with using AI.

In the meantime, I did Advent of Code as usual, for the fun of coding. Different objectives.

maximedupre 1/27/2026||
> It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful

It does hurt, and that's why all programmers now need an entrepreneurial mindset... you win if you use your skills + the new AI power to build a business.

jetsetk 1/28/2026||
That is motivational content, but not economics. Most startups will be noise, even more so than before. The value of being a founder ceases when everyone is a founder, when it becomes universal. You will need customers. Nobody wants to buy re-invented-the-wheel-74.0. It lacks character, it lacks soul. Without it, your product will be nothing but noise in a noisy world.
maximedupre 1/28/2026||
Cope. If you create something that genuinely solves a problem, people will buy it no matter what.

Look, entrepreneurship has never been easy. In fact it's always been one of the hardest things ever. I'm just saying... *you don't have to do it*. Do whatever you want lol

Happy to hear your solution for avoiding becoming totally replaceable and obsolete.

xyzsparetimexyz 1/27/2026||
What about the people who don't want to be entrepreneurs?
maximedupre 1/27/2026|||
They have to pivot to something else
maximedupre 1/27/2026||
Or stay ahead of the curve as long as possible, e.g. work on the loop/ralphing
webdevver 1/28/2026|||
permanent underclass...
shawabawa3 1/26/2026||
It's been a bit like the boiling frog analogy for me

I started by copy-pasting more and more stuff into ChatGPT. Then using more and more in-IDE prompting, then more and more agent tools (Claude etc). And suddenly I realise I barely hand-code anymore.

For sure there's still a place for manual coding, especially schemas/queries or other fiddly things where a tiny mistake gets amplified, but the vast majority of "basic work" is now just prompting, and honestly the code quality is _better_ than it was before; all kinds of refactors I didn't think about or couldn't be bothered with have happened almost automatically.

And people still call them stochastic parrots

Macha 1/27/2026||
I've had the opposite experience: it was a long time listening to people going "It's really good now" before it developed into a form that was actually worth the time to use.

ChatGPT 3.5/4 (2023-2024): The chat interface was verbose and clunky and it was just... wrong... like 70+% of the time. Not worth using.

Copilot autocomplete and GitLab Duo and Junie (late 2024-early 2025): Wayyy too aggressive at guessing exactly what I wasn't doing, and they hijacked my tab complete when pre-LLM type-tetris autocomplete was just more reliable.

Copilot Edit/early Cursor (early 2025): OK, I can sort of see uses here, but god, is picking the right files all the time such a pain; it really means I need to have figured out what I want to do in such detail already that what was even the point? Also, the models at that time quickly descended into incoherence after like three prompts; if it went off track, good luck ever correcting it.

Copilot Agent mode / Cursor (late 2025): OK, great: if the task is narrowly scoped, and I'm either going to write the tests for it or it's refactoring existing code, it could do something. Something mechanical, like a library migration where we need to replace the use of methods A/B/C with a different combination of X/Y/Z: great, it can do that. Or CRUD controller #341. I mean, sure, if my boss is going to pay for it, but not life changing.

Zed Agent mode / Cursor agent mode / Claude code (early 2026): Finally something where I can like describe the architecture and requirements of a feature, let it code, review that code, give it written instructions on how to clean it up / refactor / missing tests, and iterate.

But that was like 2 years of "really it's better and revolutionary now" before it actually got there. Now maybe in some languages or problem domains, it was useful for people earlier but I can understand people who don't care about "but it works now" when they're hearing it for the sixth time.

And I mean, what one hand gives the other takes away. I have a decent amount of new work dealing with MRs from my coworkers where they just grabbed the requirements from a stakeholder, shoved it into Claude or Cursor and it passed the existing tests and it's shipped without much understanding. When they wrote them themselves, they tested it more and were more prepared to support it in production...

ed_mercer 1/27/2026|||
I find that even for small work, telling CC to fix it for me is better, as the fix usually belongs to a thread of work, and then it understands the big picture better.
phailhaus 1/26/2026||
> And people still call them stochastic parrots

Both can be true. You're tapping into every line of code publicly available, and your day-to-day really isn't that unique. They're really good at this kind of work.

arh5451 1/28/2026||
Thank you for the really excellent summation. I echo your thought 1 to 1. I have found it more difficult to learn new languages or coding skills, because I am no longer forced to go through the painful slow grind of learning.
gregjor 1/28/2026||
Painful slow grind? I have always found the learning part to be what I enjoy most about programming. I don't intend to outsource that to a chatbot.
ed_mercer 1/28/2026||
Does one ever still need to learn new languages or coding skills if an AI will be able to do it?
dag11 1/28/2026|||
This question makes me unbelievably sad. Why should anyone learn anything?

I'm not disagreeing.

FeteCommuniste 1/28/2026|||
Probably not. But as someone who has learned a few languages, having to outsource a conversation to a machine will never not feel incredibly lame.

I doubt most people feel the same, though.

ositowang 1/27/2026|
It’s a great and insightful review—not over-hyping the coding agent, and not underestimating it either. It acknowledges both its usefulness and its limitations. Embracing it and growing with it is how I see it too.