Posted by vismit2000 1/30/2026
The loss of competency seems pretty obvious but it's good to have data. What is also interesting to me is that the AI assisted group accomplished the task a bit faster but it wasn't statistically significant. Which seems to align with other findings that AI can make you 'feel' like you're working faster but that perception isn't always matched by the reality. So you're trading learning and eroding competency for a productivity boost which isn't always there.
This is up there with believing tobacco companies health "research" from the 30s, 40s, 50s, 60s, 70s, 80s, and 90s.
> We found that using AI assistance led to a statistically significant decrease in mastery. On a quiz that covered concepts they’d used just a few minutes before, participants in the AI group scored 17% lower than those who coded by hand, or the equivalent of nearly two letter grades. Using AI sped up the task slightly, but this didn’t reach the threshold of statistical significance.
This also echoes other research from a few years ago that had similar findings: https://news.ycombinator.com/item?id=46822158
Some of you are the reason why there needs to be a new Luddite movement (fun fact: the Luddites were completely correct in their movement; they fought against oppressive factory owners who treated their fellow humans terribly, smashing the very same machines they themselves used. Entrepreneurs were literally ushering in a new hell on Earth, where their factories were killing so many orphans (because many people originally refused to work in such places, until forced to by the choice of dying in the streets or dying from their labor there) that they had to ship the bodies of children across towns to avoid drawing suspicion). Until the entrepreneurs started killing them and convinced the Prince Regent to turn the state against them, they had massive support. Support so high that when suspected Luddites were escaping from the "police", you could hear entire towns cheering them on and helping them escape.
People rightfully hate this stuff and you refuse to see it. The evidence says it's terrible, but hey, let's still sell it anyway; what's the worst that can happen?
Or here are his more recent statements on the potential disruption from AI: https://www.cnbc.com/2026/01/27/dario-amodei-warns-ai-cause-...
Anthropic is pretty much the only major frontier AI lab that keeps saying "AI is dangerous, we should proceed with caution." It sounds like you're in violent agreement.
If your stance is AI development should not be continued at all, well, the history of Luddites should tell you what happens when an economic force meets labor concerns in a Capitalistic world.
The genie is out of the bottle and there's no putting it back. Our only choices now are to figure out how to tame it, or YOLO it and FAFO.
> this is a massive conflict of interests
I think everyone is aware of this. But people like that they aren't shying away from negative results, and that builds some trust. Though let's not ignore that they're still suggesting AI + manual coding.
But honestly, this sample size is so small that we need larger studies. The results around what is effective and ineffective AI usage are a complete wash with n<8.
Also anyone else feel the paper is a bit sloppy?
I mean there are a bunch of minor things, but Figure 17 (the first figure in the appendix) is just kind of wild; there are trivial ways to fix the glaring error. The more carefully you look at even just the figures in the paper, the more you ask "who the fuck wrote this?" How do you even generate Figure 12? The numbers align with the grid but the boxes are shifted. And Figure 16 has the experience levels shuffled for some reason. And there's a lot more confusing stuff you'll see if you do more than glance...
My hypothesis is that the AI users gained less in coding skill, but improved in spec/requirement writing skills.
But there’s no data, so it’s just my speculation. Intuitively, I think AI is shifting entry level programmers to focus on expressing requirements clearly, which may not be all that bad of a thing.
We're definitely getting better at writing specs. The issue is the labor bottleneck is competent senior engineers, not juniors, not PMs, not box-and-arrow staff engineers.
> I think AI is shifting entry level programmers to focus on expressing requirements clearly
This is what the TDD advocates were saying years ago.
Dramatically improved Jira usage -- better, more descriptive tickets with actionable user stories and clearly expressed requirements. Dramatically improved github PRs. Dramatically improved test coverage. Dramatically improved documentation, not just in code but in comments.
Basically all _for free_, while at the same time probably doubling or tripling our pace at closing issues, including some issues in our backlog that had lingered for months because they were annoying and nobody felt like working on them, but were easy for claude to knock out.
Even if AI goes away tomorrow, we'll still have better tooling, documentation and processes just because we HAD to implement them to use AIs more efficiently.
> Basically all _for free_
Not for free, the cost is that all of those are now written by AI so not really vetted any longer. Or do you really think your team is just using AI for code?
I wonder if we're going to have a future where the juniors never gain the skills and experience to work well by themselves, and instead become entirely reliant on AI, assuming that's the only way.
If you sucked before using AI you are going to suck with AI. The compounded problem there is that you won't see just how bad you suck at what you do, because AI will obscure your perspective through its output, like an echo chamber of stupid. You are just going to suck much faster and feel better about it. Think of it as steroids for Dunning-Kruger.
This.
That's not what the study says. It says that most users fit your statement, while a smaller percentage benefits and learns more, faster.
Generalizations are extremely dangerous.
What the article says simply reflects that most people don't care that much and default to the path of least resistance, which is common everyday knowledge, but we know very well that this does not apply to everyone.
> Among participants who use AI, we find a stark divide in skill formation outcomes between high-scoring interaction patterns (65%-86% quiz score) vs low-scoring interaction patterns (24%-39% quiz score). The high scorers only asked AI conceptual questions instead of code generation or asked for explanations to accompany generated code; these usage patterns demonstrate a high level of cognitive engagement.
This is very much my experience. AI is incredibly useful as a personal tutor.
The LLMs have been trained on countless introductory tutorials for most popular topics, so they will provide you with a reasonable one.
Ad and friction free for now.
Enjoy it while it lasts.
Not much as a tutor.
That's not what the study says, nor is it capable of credibly making that claim. You are reasoning about individuals in an RCT where subjects did not serve as their own control. The high performers in the treatment group may have done even better had they been in the control group, and AI may in fact be slowing them down.
You don't know which is true because you can't know because of the study design. This is why we have statistics.
The qualitative breakdown says how you use AI matters for understanding. It doesn't say some learned more than the control group, and even if it did, it's not powered to show a statistical difference, which is one of the only things keeping a study from being just another anecdote on the internet.
For the sake of argument, let's say there is an individual in the treatment arm who scored higher than the highest control participant. What some want that to mean is "some engineers perform better using AI". It does not say that. That could be an objective fact(!); it doesn't matter. This study will not support it; it's an RCT. What if that programmer is just naturally gifted, or lucky(!)? This is the point of statistics.
The best you can do with outliers is say "AI usage didn't hinder some from attaining a high score" (and again, maybe it would have been higher without AI; you just can't reason about individuals in a study like this).
I hope this helps.
But despite your best efforts to teach epolanski, they’ll never learn. Their comment history shows that they’re one of the MANY confidently incorrect tools on HN.
Now, imagine a scenario of a typical SWE in today's or maybe the not-so-distant future: the agents build your software, you're simply a gate-keeper/prompt engineer, all tests pass, you're now doing a production deployment at 12am and something happens, but your agents are down. At that point, what do you do if you haven't built or even deployed the system yourself? You're like L1 support at this point: pretty useless and clueless when it comes to fully understanding and supporting the application.
So you know what I do, what I've been doing for about a decade, if the internet goes down? I stop working. And over that time I've worked in many places around the world: developing countries, tropical islands, small huts on remote mountains. And I've lost maybe a day of work because of connectivity issues. I've been deep in a rainforest during a monsoon and still had a 4G connection.
If Anthropic goes down I can switch to Gemini. If I run out of credits (people use credits? I only use a monthly subscription) then I can find enough free credits around to get some basic work done. Increasingly, I could run a local model that would be good enough for some things and that'll become even better in the future. So no, I don't think these are any kind of valid arguments. Everyone relies on online services for their work these days, for banking, messaging, office work, etc. If there's some kind of catastrophe that breaks this, we're all screwed, not just the coders who rely on LLMs.
I am genuinely curious about your work lifestyle.
The freedom to travel anywhere while working sounds awesome.
The ability to work anywhere while traveling sounds less so.
We don't have or want children but I do know people who do this with families. There's an amazing community called world schooling where people travel and arrange a month in some beautiful place around the world with other families. They'll organize teachers and activities for children and make friends with the other parents.
I've met quite a few of them - the immediate assumption people will jump to is that they must be rich. But that's not the case, they're just normal people who love to travel and have jobs that can facilitate that. And the children I've met seem happy and well adjusted.
If you tell me "I lost internet at home and couldn't work there", it's one thing. But that you simply went about a month without internet connection, I find it hard to believe.
Hell, on Tuesday I lost ~2 hours because Starlink was having some issue. When it came up I was on a different ground station and getting very low speeds. Not such a big deal except you never get that time back.
Those still have limits, no? Or if there's a subscription that provides limitless access, please tell me which one it is.
I've tried Anthropic's Max plan before but hit limits after just a couple of hours, same with Google's stuff. I wasn't doing anything radically different when I tried those compared with Codex, so it seems the others' limits are way lower.
But if I did - and I could imagine having some specific highly parallelizable work like writing a bazillion unit tests where I send out 40 subagents at a time - then the solution would be to buy two subscriptions. Not switch to API billing.
You should keep physical books, food, and medication for a SHTF scenario
"Back to Basics", "Where There Is No Doctor" and the Bible are my SHTF books
You won't be coding in a SHTF scenario.
cries on a Bavarian train
In 2022, funny enough, I was at an AWS office (I worked remotely when I worked there) working in ProServe, and us-east-1 was having issues that were affecting everything. Guess what we all did? Stopped working. The world didn’t come to an end.
Even now that I work from home, on the rare occasions that Internet goes down, I just use my phone if I need to take a Zoom call.
The problem is not using the Internet, but being expected to use it for things where there isn't a clear domain requirement for it.
The immorality I describe is on the part of the entity expecting Internet usage, not the user.
The issue is that I paid money for my hardware to own it outright, and this expectation makes it feel like I no longer actually fully own that hardware.
I also bought my phone, but I still need a global network to make it usable
This still has nothing to do with a point of view that I have already clearly laid out multiple times.
You don’t want a “dev environment dependent on the internet”, exactly what are you going to do with your code without the internet? Just keep it on your computer?
Actually, the last thing you probably want is somebody reverting back to doing things the way we did them 20 years ago and creating a big mess. Much easier to just declare an outage and deal with it properly according to some emergency plan (you do have one, right?).
CI/CD are relatively new actually. I remember doing that stuff by hand. I.e. I compiled our system on my Desktop system, created a zip file, and then me and our operations department would use an ISDN line to upload the zip file to the server and "deploy" it by unzipping it and restarting the server. That's only 23 years ago. We had a Hudson server somewhere but it had no access to our customer infrastructure. There was no cloud.
I can still do that stuff if I need to (and I sometimes do ;-) ). But I wouldn't dream of messing with a modern production setup like that. We have CI/CD for a reason. What if CI/CD were to break? I'd fix it rather than adding to the problem by panicking and doing things manually.
Take a look at how ridiculously much money is invested in these tools and the companies behind them. Those investments expect a return somehow.
Anthropic, a common coding model provider, has said that their models generate enough cash to cover their own training costs before the next one is released. If they stopped getting massive investments, they should be able to coast with the models they have.
Your base assumption is that it is expensive and therefore these companies will eventually fail when they keep on making less money than they are spending. The reality is that they are indeed spending enormously now and making a lot of very non linear progress. At the same time a lot of that stuff is being widely published and quite a lot of it is open source. At some point you might get consolidation and maybe some companies indeed don't make it. But their core tech will survive. Investors might be crying in a corner. But that won't stop people from continuing to use the tech in some form or another.
I already have a laptop that can run some modestly largish models locally. I'm not going to spend 40K or whatever on something that can run a GPT 5 class model. But it's not going to cost that in a few years either. This tech is here to stay. We might pay more or less for it. The current state is the worst it is ever going to be. It's going to be faster, bigger, better, cheaper, more useful, etc. At some point the curves flatten and people might start paying more attention to cost. Maybe don't burn a lot of gas in expensive and inefficient gas generators (as opposed to more efficient gas power plants) and use cheap wind/solar instead. Maybe get some GPUs from a different vendor at a lower price? Maybe take a look at algorithmic efficiency, etc. There is a lot of room for optimization in this market. IMHO the surviving companies will be making billions, will be running stuff at scale, and will be highly profitable.
Maybe some investors won't get their money back. Shit happens. That's why it's called venture capital. The web bubble bursting didn't kill the web either.
https://boto3.amazonaws.com/v1/documentation/api/latest/inde...
I don’t need to comprehend “the library”. I need to know what I need to do and then look up the API call.
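For what it's worth, a minimal sketch of that look-it-up-when-needed workflow (the bucket name and prefix are made up; `list_objects_v2` is just the call I'd go find in the reference):

    import boto3

    # I don't hold the whole boto3 surface in my head; I know I need
    # "list the objects under a prefix" and look up the exact call.
    s3 = boto3.client("s3")
    resp = s3.list_objects_v2(Bucket="my-example-bucket", Prefix="logs/2026/")
    for obj in resp.get("Contents", []):
        print(obj["Key"], obj["Size"])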
You're an engineer, your goal is to figure stuff out using the best tools in front of you
Humans are resilient, they reliably perform (and throw great parties) in all sorts of chaotic conditions. Perhaps the thing that separates us most from AI is our ability to bring out our best selves when baseline conditions worsen
I'll happily optimize my life for 99.999% of the time.
If the Internet is down for a long time, I've got bigger problems anyway. Like finding food.
I don't know about you, but I don't connect to the internet most of the time, and it makes me more productive, not less.
IF I were totally dependent on it, I would be in trouble. Fortunately I am not.
... Why wouldn't you build software that works there?
As I understand things, the purpose of computers is to run software.
But more importantly, let's suppose your software does require an Internet connection to function.
Why should that imply a requirement for your development environment to have one?
Why should that imply a requirement for a code generation tool to have one?
> But more importantly, let's suppose your software does require an Internet connection to function.
Because I have been able to depend on “fast” internet since 2000 both at home and at work, just like I’ve been able to depend on a compiler since 1992? There is nothing so important that can’t wait in the rare chance that internet goes out.
> Why should that imply a requirement for a code generation tool to have one
Because I don’t want to spend thousands of dollars to run a frontier model locally when I can spend $20/month and codex is included with my ChatGPT subscription?
Why would they remotely need any of that, if "to a first approximation no one wants desktop software"?
> when I can spend $20/month and codex is included with my ChatGPT subscription?
I bought the machine I'm posting from for about $1k (with some minor upgrades since then). Canadian. More than 11 years ago. And that gets me the entire computer rather than one specific cloud service.
$20/month is a lot, actually.
Even comparing to a new computer (which there is apparently still a lot of demand for): monthly charges really should be compared to a couple decades of principal, the amount you'd have to save up to yield that cash flow as a return on investment (or just interest). But even just a year or two of $20/month is hundreds of dollars. That's not insignificant, when the opportunity cost is reckoned in terms of physical goods that perform general computation.
With that $1000 computer can you run an LLM that can write code for you?
Your teachers had the right goal, but a bad argument. Learning arithmetic isn't just about being able to do a calculation. It's about getting your brain comfortable with math. If you always have to pull out a goddamn calculator, you'll be extremely limited.
Trust me, elementary-age me was dumb to not listen to those teachers and to become so calculator-dependent.
We just really underestimate sentimentality in our society because it doesn't fit our self conception.
Yes, but I don't think that is the actual bottleneck. Even when they do, most children probably don't care about abstract goals, but rather about immediate skills in their everyday life, or just the bare statement that they will need it.
One conclusion might be that it'd be better for some students if teachers understood the why, as they might change their approach on some subjects. An example: knowing that certain equations and patterns EXIST, and which kinds of problems they apply to, is generally much more important than knowing the actual equations by heart.
What if one day you couldn't just go to the art supply store and buy a pre-stretched canvas?
It is all beside the point anyway. "You are going to learn to stretch canvas by hand first because that is what my teacher made me do!"
(I jest a bit, actually agree since turning assembly->compiled code is a tighter problem space than requirements in natural language->code)
Still a terrible apples to oranges comparison.
Learn to consider whether you are misinterpreting the person, or whether there's something you don't know, and ask questions. You literally wrote that you had an issue reading the whole comment, and dialogue would have resolved that specific misunderstanding instead of your antagonistic comment. I literally wrote "in jest" for a reason, and you conveniently continue to miss it even after resolving your rendering issues. Jokes and tongue-in-cheek remarks are ways to open a conversation, not to invite lazy or stupid comments.
It's not my problem that you can't do these simple things like asking questions or reading/listening closely. And when you decide to be antagonistic about it, then you're just digging yourself into a hole (your loss and I don't feel sorry for you). Still, I don't think you understand what I was saying, but that doesn't matter because you seem to only care to vent.
You haven't contributed anything interesting to this thread. Just stop. You want to argue? Use your head, for your own sake.
You are at least a decade late to post fears about developers' reliance on the internet. It was complete well before the LLM era.
I use SO quite often, but it is for questions I would otherwise consult other people, because I can't figure it out short of reverse-engineering something. For actual documentation man pages and info documents are pretty awesome. Honestly I dread leaving the world of libraries shipped with my OS vendor, because the quality of documentation drops fast.
Before you go on about kids these days, my first time coding was on an Apple //e in assembly.
Copilot comparison:
Intelligence: Qwen2.5-Coder-32B is widely considered the first open-source model to reach GPT-4o and Claude 3.5 Sonnet levels of coding proficiency. While Copilot (using GPT-4o) remains highly reliable, Qwen often produces more concise code and can outperform cloud models in specific tasks like code repair.
Latency: Local execution on an M3 Max provides near-zero network latency, resulting in faster "start-to-type" responses than Copilot, which must round-trip to the cloud.
Reliability: Copilot is an all-in-one "vibe" that integrates deeply into VS Code. Qwen requires local tools like Ollama or MLX-LM and a plugin like Continue.dev to achieve the same UX.
GPT-Codex:
Intelligence & Reasoning: In recent 2025–2026 benchmarks, the Qwen3-Coder series has emerged as the strongest open-source performer, matching the "pass@5" resolution rates of flagship models like GPT-5-High. While OpenAI’s latest GPT-5.1-Codex-Max remains the overall leader in complex, project-wide autonomous engineering, Qwen is frequently cited as the better choice for local, file-specific logic.
Architecture & Efficiency: OpenAI models like GPT-OSS-20b (a Mixture-of-Experts model) are optimized for extreme speed and tool-calling. However, the M3 Max with 64GB is powerful enough to run the Qwen3-Coder-30B or 32B models at full fidelity, which provides superior logic to OpenAI's smaller "mini" or "OSS" models.
Context Window: Qwen models offer substantial context (up to 128K–256K tokens), which is comparable to OpenAI’s specialized Codex variants. This allows you to process entire modules locally without the high per-token cost of sending that data to OpenAI's servers.
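To make the local setup concrete, here's a minimal sketch using the official `ollama` Python client against a locally pulled Qwen coder model (the model tag and prompt are just placeholders, and this assumes an Ollama server is already running with the model pulled):

    import ollama  # pip install ollama; talks to the local Ollama server

    # One round-trip to the local model, no network hop to a cloud API.
    resp = ollama.chat(
        model="qwen2.5-coder:32b",  # assumes `ollama pull qwen2.5-coder:32b` was run
        messages=[{
            "role": "user",
            "content": "Write a Python function that parses ISO 8601 timestamps.",
        }],
    )
    print(resp["message"]["content"])

Editor plugins like Continue.dev can then point at the same local Ollama endpoint to get the in-IDE experience.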
What happens when GitHub goes down? You shrug and take a long lunch.
* all services are run at a loss and they increase price to the point the corp doesn’t want to pay for everyone any more.
* it turns out that our chats are used for corporate espionage and the corps get spooked and cut access
* some dispute between EU and US happens and they cut our access.
The solution’s having EU and local models.
This is why I suggest developers use the free time they gain back writing documentation for their software (preferably in their own words, not just AI slop), reading official docs, sharpening their sword, and learning design patterns more thoroughly. The more you know about the code and how to code, the more you can guide the model to pick a better route for a solution.
Apply to anything else: you could eat out at restaurants every night, and it would do a great job in feeding you! Think of all the productivity you would gain relying on agential chefs. With restaurants even I can eat like a French chef, they have truly democratized food. And they do a perfect job these days executing dishes, only some mistakes.
Your pizza restaurant is all wonderful and all but what happens when the continual supply of power to the freezer breaks? How will you run your restaurant then?
i would work on the hundreds of non-coding tasks that i need to do. or just not work?
what do you do when github actions goes down?
Well, yeah. You were still (presumably) debugging the code you did write in the higher level language.
The linked article makes it very clear that the largest decline was in problem solving (debugging). The juniors starting with AI today are most definitely not going to do that problem-solving on their own.
One of my advantages(?) when it comes to using AI is that I've been the "debugger of last resort" for other people's code for over 20 years now. I've found and fixed compiler code generation bugs that were breaking application code. I'm used to working in teams and to delegating lots of code creation to teammates.
And frankly, I've reached a point where I don't want to be an expert in the JavaScript ORM of the month. It will fall out of fashion in 2 years anyway. And if it suddenly breaks in old code, I'll learn what I need to fix it. In the meantime, I need to know enough to code review it, and to thoroughly understand any potential security issues. That's it. Similarly, I just had Claude convert a bunch of Rust projects from anyhow to miette, and I definitely couldn't pass a quiz on miette. I'm OK with this.
I still develop deep expertise in brand new stuff, but I do so strategically. Does it offer a lot of leverage? Will people still be using it on greenfield projects next year? Then I'm going to learn it.
So at the current state of tech, Claude basically allows me to spend my learning strategically. I know the basics cold, and I learn the new stuff that matters.
I'd kinda like to see this measured. It's obviously not the assembly that matters for nine-9s of jobs. (I used assembly language exactly one time in my career, and that was three lines of inline in 2003.) But you develop a certain set of problem-solving skills when you code assembly. I speculate, like with most problem-solving skills, it has an impact on your overall ability and performance. Put another way, I assert nobody is worse for having learned it, so the only remaining question is, is it neutral?
> everyone's acting like all human coders are better than all AI's
I feel like the sentiment here on HN is that LLMs are better than all novices. But human coders with actual logical and architectural skills are better than LLMs. Even the super-duper AI enthusiasts talk about controlling hordes of LLMs doing their bidding--not the other way around.
And I must admit my appetite in learning new technologies has lessened dramatically in the past decade; to be fair, it gets to a point that most new ideas are just rehashing of older ones. When you know half a dozen programming languages or web frameworks, the next one takes you a couple hours to get comfortable with.
I'm a bit younger (33) but you'd be surprised how fast it comes back. I hadn't touched x86 assembly for probably 10 years at one point. Then someone asked a question in a modding community for an ancient game and after spending a few hours it mostly came back to me.
I'm sure if you had to reverse engineer some win32 applications, it'd come back quickly.
That's a skill unto itself, and I mean the general stuff doesn't fade, or at least comes back quickly. But there's a lot of the tail end that's just difficult to recall because it's obscure.
How exactly did I hook Delphi apps' TForm handling system instead of breakpointing GetWindowTextA and friends? I mean... I just cannot remember. It wasn't super easy either.
These last few months, however, I've had to spend a lot of time debugging via disassembly for my work. It felt really slow at first, but then it came back to me and now it's really natural again.
If you’ve forgotten your Win32 reverse engineering skills I’m guessing you haven’t done much of that in a long time.
That said, it’s hard to truly forget something once you’ve learned it. If you had to start doing it again today, you’d learn it much faster this time than the first.
For what it’s worth—it’s not entirely clear that this is true: https://en.wikipedia.org/wiki/Hyperthymesia
The human brain seemingly has the capability to remember (virtually?) infinite amounts of information. It’s just that most of us… don’t.
Compression/algorithms don't save you here either. The algorithm for pi is very short, but pulling up any particular random digit of pi still requires the expenditure of some particular amount of entropy.
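As a toy illustration of that point: Machin's formula is only a few lines of plain integer arithmetic, yet producing the n-th digit still costs work that grows with n.

    def pi_digits(n):
        """First n decimal digits of pi via Machin's formula:
        pi/4 = 4*arctan(1/5) - arctan(1/239)."""
        scale = 10 ** (n + 10)  # fixed-point scale with guard digits

        def atan_inv(x):
            # arctan(1/x) * scale, summed term by term until terms vanish
            total = term = scale // x
            k, sign = 3, -1
            while term:
                term //= x * x
                total += sign * (term // k)
                k, sign = k + 2, -sign
            return total

        pi = 4 * (4 * atan_inv(5) - atan_inv(239))
        return str(pi)[:n]

    print(pi_digits(30))  # 314159265358979323846264338327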
The important question is can you learn enough in a standard human lifetime to "fill up your knowledge bank"?
2) Hyperthymesia is about remembering specific events in your past, not about retaining conceptual knowledge.
The inventor of APL said that he was developing not a programming language but a notation to express as many problems as one can. He found that expressing more and more problems with the notation first made the notation grow, and then its size started to shrink.
To develop conceptual knowledge (the point where one's "notation" starts to shrink), one has to have a good memory for re-expressing more and more problems.
You can't model systems in your mind using past experiences, at least not reliably and repeatedly.
Your lived experience is not a systematic model of anything, what this type of memory gives you is a vivid set of anecdotes describing personally important events.
Ok, so my statement is essentially correct.
Most of us can not keep infinite information in our brain.
If you moved back to a country you hadn't lived in, or spoken its language, for 10 years, you would find that you don't have to relearn it; it would come back quickly.
Also, memory is supposedly almost infinite: with the increased efficiency you gain as you learn, volume limits become largely redundant.
I’m not sure if this is in the Wikipedia article, but when I last read about this, years ago, there seemed to be a link between hyperthymesia and OCD. Brain scans suggested the key was in how these individuals organize the information in their brains, so that it’s easy for them to retrieve.
Before the printing press was common, it was common for scholars to memorize entire books. I absolutely cannot do this. When technology made memorization less necessary, our memories shrank. Actually shrank, not merely changing what facts to focus on.
And to be clear, I would never advocate going back to the middle ages! But we did lose something.
We can “store” infinite numbers by using our numeral system as a generator of sorts for whatever the next number must be without actually having to remember infinite numbers, but I do not believe it would be physically possible to literally remember every item in some infinite set.
Sure, maybe we’ve gotten lazy about memorizing things and our true capacity is higher (maybe very much so), but there is still some limit.
Additionally, the practical limit will be very different for different people. Our brains are not all the same.
Think about how we talk about exercise. Yes, there probably is a theoretical limit to how fast any human could run, and maybe Olympic athletes are close to that, but most of us aren’t. Also, if you want your arms to get stronger, it isn’t bad to also exercise your legs; your leg muscles don’t somehow pull strength away from your arm muscles.
No, but the limiting factor is the amount of stored energy available in your body. You could exhaust your energy stores using only your legs and be left barely able to use your arms (or anything else).
If we’ve offloaded our memory capacity to external means of rapid recall (ex. the internet) then what have we gained in response? Breadth of knowledge? Increased reasoning abilities? More energy for other kinds of mental work? Because there’s no cheating thermodynamics, even thinking uses energy. Or are we just simply radiating away that unused energy as heat and wasting that potential?
It has two huge benefits: nearly infinite memory for truly interesting stuff and still looking friendly to people who tell me the same stuff all the times.
Side-effect: my wife is not always happy that I forgot about "non-interesting" stuff which are still important ;-)
> When you know half a dozen programming languages or web frameworks, the next one takes you a couple hours to get comfortable with.
Learn yourself relational algebra. It will invariably lead you to optimization problems, and these will also invariably lead you to equality saturation, which is most effectively implemented with... generalized join from relational algebra! Also, relational algebra implements content-addressable storage (CAS), which is essential for the data flow computing paradigm. Thus, you will have a window into CPU design.
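In case "generalized join" sounds abstract, here's a toy natural join over relations represented as lists of dicts (purely illustrative, not tied to any particular engine):

    def natural_join(r, s):
        """Join two relations (lists of dicts) on their shared attribute names."""
        shared = set(r[0]) & set(s[0]) if r and s else set()
        return [{**a, **b}
                for a in r
                for b in s
                if all(a[k] == b[k] for k in shared)]

    emp = [{"dept": 1, "name": "alice"}, {"dept": 2, "name": "bob"}]
    dept = [{"dept": 1, "title": "analytics"}, {"dept": 2, "title": "infra"}]
    print(natural_join(emp, dept))
    # [{'dept': 1, 'name': 'alice', 'title': 'analytics'},
    #  {'dept': 2, 'name': 'bob', 'title': 'infra'}]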
At 54 (36 years of professional experience) I find these rondos fascinating.
I felt like that for a while, but I seem to be finding new challenges again. Lately I've been deep-diving on data pipelines and embedded systems. Sometimes I find problems that are easy enough to solve by brute force, but elegant solutions are not obvious at all. It's a lot of fun.
It could be that you're way ahead of me and I'll wind up feeling like that again.
I use remnote for that.
I write cards and quizzes for all kinds of stuff, and I tend to retain it for years after practicing it with the low friction of spaced repetition.
One take-away for us from that viewpoint was that knowledge in fact is more important than the lines of code in the repo. We'd rather lose the source code than the knowledge of our workers, so to speak.
Another point is that when you use consultants, you get lines of codes, whereas the consultancy company ends up with the knowledge!
... And so on.
So, I wholeheartedly agree that programming is learning!
Isn't this the opposite of how large tech companies operate? They can churn developers in/out very quickly, hire-to-fire, etc., but the code base lives on. There is little incentive to keep institutional knowledge. The incentives are PRs pushed and value landed.
Isn't large amounts of required institutional knowledge typically a problem?
We had domain specialists with decades of experience and knowledge, and we looked at our developers as the "glue" between domain knowledge and computation (modelling, planning and optimization software).
You can try to make this glue have little knowledge, or lots of knowledge. We chose the latter and it worked well for us.
But I was only in that one company, so I can't really tell.
People naturally try to use what they've learned but sometimes end up making things more complicated than they really needed to be. It's a regular problem even excluding the people intentionally over-complicating things for their resume to get higher paying jobs.
I could have sworn I was meant to be shipping all this time...
The models are too good now. One thing I've noticed recently is that I've stopped dreaming about tough problems, be it code or math. The greatest feeling in the world is pounding your head against a problem for a couple of days and waking up the next morning with the solution sketched out in your mind.
I don't think the solution is to be going full natty with things, but to work more alongside the code in an editor, rather than doing things in CLI.
The amount of context switching in my day-to-day work has become insane. There's this culture of “everyone should be able to do everything” (within reason, sure), but in practice it means a data scientist is expected to touch infra code if needed.
Underneath it all is an unspoken assumption that people will just lean on LLMs to make this work.
I also used to get great pleasure from banging my head against a problem and then the sudden revelation.
But that takes time. I was valuable when there was no other option. Now? Why would someone wait when an answer is just a prompt away?
They can give plausible architecture but most of the time it’s not usable if you’re starting from scratch.
When you design the system, you’re an architect not a coder, so I see no difference between handing the design to agents or other developers, you’ve done the heavy lifting.
From that perspective, I find LLMs quite useful for learning. But instead of coding, I find myself in long back-and-forth sessions asking questions, requesting examples, sequence diagrams, etc., to visualise the final product.
It is a pattern matching problem and that seems to me to be something AI is/will be particularly good at.
Maybe it won’t be the perfect architecture, or the most efficient implementation. But that doesn’t seem to have stopped many companies before.
For hobby projects though, it's awesome. It just really struggles to do things right in the big codebase at work.
And how much better than palantir given that musk is a bigot, attempts to buy elections for fascists, meddles in foreign democracies to push far right extremist narratives, used his wealth to steal very sensitive data from government agencies, does Nazi salutes, trains his LLM to be racist...
> bigot
> fascist
> far right extremist
> nazi
> racist
(just pulled a few words from your small comment)
It’s not virtue signaling to say the guy throwing around nazi salutes is in fact a nazi.
Good one
And then you find out someone else had already solved it. So might as well use the Google 2.0 aka ChatGPT.
My first thought was that I can abstract what I wrote yesterday, which was a variation of what I built over the previous week. My second thought was a physiological response of fear that today is going to be a hard hyper focus day full of frustration, and that the coding agents that built this will not be able to build a modular, clean abstraction. That was followed by weighing whether it is better to have multiple one off solutions, or to manually create the abstraction myself.
I agree with you 100 percent that the poor performance of models like GPT 4 introduced some kind of regularization into the human-in-the-loop coding process.
Nonetheless, we live in a world of competition, and the people who develop techniques that give them an edge will succeed. There is a video about the evolution of technique in the high jump, the Western Roll, the Straddle Technique, and finally the Fosbury Flop. Using coding agents will be like this too.
I am working with 150 GB of time series data. There are certain pain points that need to be mitigated. For example, a different LLM model has to be coerced into analyzing or working with the data from a completely different approach in order to validate the results. That means instead of the whole task being 4x faster, each iteration is 4x faster but needs to be done twice, so it is still only 2x faster. I burned $400 in tokens in January. This cannot be good for the environment.
Timezone handling always has to be validated manually. Every exploration of the data is a train and test split. Here is the thing that hurts the most: the AI coding agents always show the top test results, not the test results of the top train results. Rather than tell me a model has no significant results, they will hide that and only present the winning outliers, which is misleading and, as the OP research suggests, very dangerous.
A lot of people are going to get burned before the techniques to mitigate this are developed.
Overfitting has always been a problem when working with data. Just because the barrier of entry for time series work is much lower does not mean that people developing the skill, whether using old school tools like ARIMA manually or having AI do the work, escape the problem of overfitting. The models will always show the happy, successful looking results.
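A small synthetic sketch of that selection effect (hypothetical scores, nothing from my actual data): reporting the best test score across many tries is not the same as selecting on train/validation and then reporting that model's test score.

    import random

    random.seed(0)

    # 50 "models" with no real skill: train and test scores are pure noise around 0.5.
    candidates = [{"train": 0.5 + random.gauss(0, 0.05),
                   "test": 0.5 + random.gauss(0, 0.05)} for _ in range(50)]

    # Misleading report: cherry-pick the best *test* score.
    best_by_test = max(candidates, key=lambda c: c["test"])
    # Honest report: select on train, then read off that model's test score.
    best_by_train = max(candidates, key=lambda c: c["train"])

    print("cherry-picked best test score:", round(best_by_test["test"], 3))
    print("test score of best-by-train model:", round(best_by_train["test"], 3))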
Just like calculators are used when teaching higher math at the secondary level so basic arithmetic does not slow the process of learning math skills, AI will be used in teaching too. What we are doing is confusing techniques that have not been developed yet with not being able to acquire skills. I wrack and challenge my brain every day solving these problems. As millions of other software engineers do as well, the patterns will emerge and later become the skills taught in schools.
Ouch.
See also: https://news.ycombinator.com/item?id=46820924
> On average, participants in the AI group finished about two minutes faster, although the difference was not statistically significant. There was, however, a significant difference in test scores: the AI group averaged 50% on the quiz, compared to 67% in the hand-coding group
Common example here is learning a language. Say, you learn French or Spanish throughout your school years or on Duolingo. But unless you're lucky enough to be amazing with language skills, if you don't actually use it, you will hit a wall eventually. And similarly if you stop using language that you already know - it will slowly degrade over time.
Personally, I’ve never been learning software development concepts faster—but that’s because I’ve been offloading actual development to other people for years.
> "We collect self-reported familiarity with AI coding tools, but we do not actually measure differences in prompting techniques."
Many people drive cars without being able to explain how cars work. Or use devices like that. Or interact with people whose thinking they can't explain. Society works like that; it is functional, and it does not work by full understanding. We need to develop the functional part, not the full-understanding part. We can write C without knowing the machine code.
You can often recognize a wrong note without being able to play the piece, spot a logical fallacy without being able to construct the valid argument yourself, catch a translation error with much less fluency than producing the translation would require. We need discriminative competence, not generative.
For years I maintained a library for formatting dates and numbers (prices, ints, ids, phones), it was a pile of regex but I maintained hundreds of test cases for each type of parsing. And as new edge cases appeared, I added them to my tests, and iterated to keep the score high. I don't fully understand my own library, it emerged by scar accumulation. I mean, yes I can explain any line, but why these regexes in this order is a data dependent explanation I don't have anymore, all my edits run in loop with tests and my PRs are sent only when the score is good.
Correctness was never grounded in understanding the implementation. Correctness was grounded in the test suite.
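A minimal sketch of that workflow (the pattern and cases here are hypothetical, not the actual library): the accumulated test cases are the ground truth, and any edit only ships if the suite stays green.

    import re

    # Hypothetical normalizer: the regexes accreted one edge case at a time.
    def normalize_phone(raw: str) -> str:
        digits = re.sub(r"\D", "", raw)                  # keep digits only
        digits = re.sub(r"^00", "", digits)              # strip 00 international prefix
        digits = re.sub(r"^1(\d{10})$", r"\1", digits)   # strip leading US country code
        return digits

    # Every edge case that ever bit us becomes a test case.
    CASES = {
        "(555) 123-4567": "5551234567",
        "+1 555 123 4567": "5551234567",
        "001-555-123-4567": "5551234567",
    }

    for raw, expected in CASES.items():
        assert normalize_phone(raw) == expected, (raw, normalize_phone(raw))
    print("all cases pass")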
I think being a programmer is closer to being an aircraft pilot than a car driver.
But fundamentally, all cars behave the same way all the time. Imagine running a courier company where the vehicles sometimes take a random left turn.
> Or interact with people who's thinking they can't explain
Sure, but they trust those service providers because they are reliable. And the reason that they are reliable is that the service providers can explain their own thinking to themselves. Otherwise their business would be chaos and nobody would trust them.
How you approached your library was practical given the use case. But can you imagine writing a compiler like this? Or writing an industrial automation system? Not only would it be unreliable but it would be extremely slow. It's much faster to deal with something that has a consistent model that attempts to distill the essence of the problem, rather than patching on hack by hack in response to failed test after failed test.
But isn't it the correction of those errors that is valuable to society and gets us a job?
People can tell they found a bug or give a description about what they want from a software, yet it requires skills to fix the bugs and to build software. Though LLMs can speedup the process, expert human judgment is still required.
If you know that you need O(n) "contains" checks and O(1) retrieval for items, for a given order of magnitude, it feels like you've all the pieces of the puzzle needed to make sure you keep the LLM on the straight and narrow, even if you didn't know off the top of your head that you should choose ArrayList.
Or if you know that string manipulation might be memory intensive so you write automated tests around it for your order of magnitude, it probably doesn't really matter if you didn't know to choose StringBuilder.
That feels different to e.g. not knowing the difference between an array list and linked list (or the concept of time/space complexity) in the first place.
When it comes to fundamentals, I think it's still worth the investment.
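In Python terms (a rough stand-in for the ArrayList/StringBuilder examples above), a throwaway check like this is usually enough to confirm the generated code meets the stated complexity requirements:

    import timeit

    n = 100_000
    items = list(range(n))   # O(n) membership scan, O(1) access by index
    lookup = set(items)      # O(1) average-case membership
    needle = n - 1

    print("list contains:", timeit.timeit(lambda: needle in items, number=1_000))
    print("set contains: ", timeit.timeit(lambda: needle in lookup, number=1_000))

(For the string case, "".join(parts) is the usual StringBuilder analogue.)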
To paraphrase, "months of prompting can save weeks of learning".
Tests only cover cases you already know to look for. In my experience, many important edge cases are discovered by reading the implementation and noticing hidden assumptions or unintended interactions.
When something goes wrong, understanding why almost always requires looking at the code, and that understanding is what informs better tests.
Instead, just learning concepts with AI and then using HI (Human Intelligence) & AI to solve the problem at hand (by going through the code line by line and writing tests) is a better approach productivity-, correctness-, efficiency-, and skill-wise.
I can only think of LLMs as fast typists with some domain knowledge.
Like typists of government/legal documents who know how to format documents but cannot practice law. Likewise, LLMs are code typists who can write good/decent/bad code but cannot practice software engineering - we need, and will need, a human for that.
> AI assistance produces significant productivity gains across professional domains, particularly for novice workers. Yet how this assistance affects the development of skills required to effectively supervise AI remains unclear. Novice workers who rely heavily on AI to complete unfamiliar tasks may compromise their own skill acquisition in the process. We conduct randomized experiments to study how developers gained mastery of a new asynchronous programming library with and without the assistance of AI. We find that AI use impairs conceptual understanding, code reading, and debugging abilities, without delivering significant efficiency gains on average. Participants who fully delegated coding tasks showed some productivity improvements, but at the cost of learning the library. We identify six distinct AI interaction patterns, three of which involve cognitive engagement and preserve learning outcomes even when participants receive AI assistance. Our findings suggest that AI-enhanced productivity is not a shortcut to competence and AI assistance should be carefully adopted into workflows to preserve skill formation -- particularly in safety-critical domains.
AI assistance produces significant productivity gains across professional domains, particularly for novice workers.
We find that AI use impairs conceptual understanding, code reading, and debugging abilities, without delivering significant efficiency gains on average.
Are the two sentences talking about non-overlapping domains? Is there an important distinction between productivity and efficiency gains? Does one focus on novice users and one on experienced ones? Admittedly did not read the paper yet, might be clearer than the abstract.
The research question is: "Although the use of AI tools may improve productivity for these engineers, would they also inhibit skill formation? More specifically, does an AI-assisted task completion workflow prevent engineers from gaining in-depth knowledge about the tools used to complete these tasks?" This hopefully makes the distinction more clear.
So you can say "this product helps novice workers complete tasks more efficiently, regardless of domain" while also saying "unfortunately, they remain stupid." The introductory lit review/context setting cites prior studies to establish "OK, coders complete tasks efficiently with this product." But then they say, "our study finds that they can't answer questions." They have to say "earlier studies find that there were productivity gains" in order to say "do these gains extend to other skills? Maybe not!"
AI assistance produces significant productivity gains [...].
We find that AI use [...] [is not] delivering significant efficiency gains on average.
While prior research found significant productivity gains, we find that AI use is not delivering significant efficiency gains on average while also impairing conceptual understanding, code reading, and debugging abilities.
I learned a lot more in a short amount of time than I would've stumbling around on my own.
Afaik its been known for a long time that the most effective way of learning a new skill, is to get private tutoring from an expert.
But that's what "impairs learning" means.
Previous title: "Anthropic: AI Coding shows no productivity gains; impairs skill development"
The previous title oversimplified the claim to "all" developers. I found the previous title meaningful when submitting this post because most of the false AI claims that "software engineering is finished" have mostly affected junior `inexperienced` engineers. But I think `junior inexperienced` was implicit, which many people didn't pick up on.
The paper makes a more nuanced claim that AI Coding speeds up work for inexperienced developers, leading to some productivity gains at the cost of actual skill development.