Posted by poisonfountain 8 hours ago
Don’t sell yourself short! Taste is not promptable; I suspect good taste is AGI-complete.
Especially in domains like fintech, there is a lot of accumulated wisdom, and that is what you’ll be handsomely paid for (for at least the next couple years :/ )
For example, architectural patterns: knowing when you need bitemporality, immutable logs, CQRS. These are all good patterns that can only be learned by owning system architecture for years - none of those feedback loops are in the training set.
And from a product design side, agents will just miss key concepts and you need a few words to prompt a fix - but those few words might represent a massive tree search optimization, or in many cases the agent would just fail to identify the requirement at all. These steers feel small, but by evaporation our work has distilled down to just the extremely high value insights.
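To make "bitemporality plus an immutable log" concrete, here is a minimal sketch, not a production design. The names `RateFact`, `BitemporalLog`, and `rate_as_of` are made up for illustration, and the fintech-flavored example (an interest rate corrected after the fact) is an assumption, not something from the thread. The idea is that every fact carries two dates, when it is true in the world and when the system learned it, and corrections are appended rather than overwritten.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class RateFact:
    """One immutable fact (hypothetical schema); corrections are new facts."""
    account: str
    rate: float
    valid_from: date    # when the rate takes effect in the real world
    recorded_on: date   # when the system learned about it


class BitemporalLog:
    """Append-only store that can answer 'what did we believe, and when?'."""

    def __init__(self) -> None:
        self._facts: List[RateFact] = []

    def record(self, fact: RateFact) -> None:
        # Append only: existing facts are never updated or deleted.
        self._facts.append(fact)

    def rate_as_of(self, account: str, valid_on: date, known_on: date) -> Optional[float]:
        # Keep facts already in effect on valid_on and already recorded by known_on.
        visible = [
            f for f in self._facts
            if f.account == account
            and f.valid_from <= valid_on
            and f.recorded_on <= known_on
        ]
        if not visible:
            return None
        # Latest effective date wins; among ties, the latest correction wins.
        return max(visible, key=lambda f: (f.valid_from, f.recorded_on)).rate


log = BitemporalLog()
log.record(RateFact("acct-1", 0.02, date(2024, 1, 1), date(2024, 1, 1)))
# Late correction: the rate for January was actually 3%, learned in February.
log.record(RateFact("acct-1", 0.03, date(2024, 1, 1), date(2024, 2, 15)))
```

Asking for the January 10 rate "as known on January 31" still gives the old value, while asking "as known on March 1" gives the corrected one. That ability to reproduce what a report looked like at the time is the part that tends to get missed until an audit asks for it.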
METR task time is still at weeks, doubling every 7 months; it’s years (assuming we keep riding this crazy exponential) until you hit multi-year tasks. I don’t see wisdom / metis being solved in 2027.
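A back-of-the-envelope version of that extrapolation, as a sketch only. The 7-month doubling time is the figure from the comment; the start and target horizons (2 work-weeks and 2 work-years at 40 hours a week) are my own assumed numbers, and `months_until` is a hypothetical helper.

```python
import math


def months_until(start_hours: float, target_hours: float,
                 doubling_months: float = 7.0) -> float:
    """Months for a task horizon to grow from start to target,
    assuming clean exponential growth with a fixed doubling time."""
    doublings = math.log2(target_hours / start_hours)
    return doublings * doubling_months


# Assumed endpoints, not measured values:
two_work_weeks = 2 * 40        # 80 hours
two_work_years = 2 * 52 * 40   # 4160 hours

months = months_until(two_work_weeks, two_work_years)
print(f"{months:.1f} months, about {months / 12:.1f} years")
```

With these assumptions it comes out to a bit under 6 doublings, roughly 40 months. Change the starting horizon and the answer shifts by 7 months per factor of two, so the conclusion is only as good as the starting estimate.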
All this said - I think it’s important to extrapolate forwards. If the trend continues, this may all be true in 3-5 years. Now is the time to pre-register what metrics would make you worried, so that you can define your red lines. There will be a rapid consolidation of power and wealth if these tools continue on their existing growth trajectory.
I have little to add to it, except that I agree completely. Not sure what’s next
Who you belong to depends on at least two things: A) How knowledgeable is the AI about what you are working on? B) How well do you wield these new tools to work better than before? (Better here can mean many different things.)
Opus is getting good at architecture - I need fewer "pushbacks", either because I have learnt to say the right thing or because it has learnt to do the right thing - I do not know which one.
Honestly, the only hope that the dev field has is this all being so economically inefficient that the industry as we know it collapses after the VC subsidies run out, and we’re going to pivot towards much more reasonable interventions with local models and such.
I feel that I am faster and better, sure, but trusting self perception would be an absurd thing to do.
I think the author downplays how much of that knowledge is used in knowing what to zoom in on, what to prompt, or what to look for.
I see this as a negative, the whole "once everyone has everything, then everyone has nothing" type of argument. The company I work for believes strongly in keeping humans in control and in the loop, which is something I’m grateful for, but at the same time who knows how long that will last. Companies are starting to get their AI bills and realizing how much this AI usage actually costs, so only time will tell. But I hope, for the sake of everyone, that those with the knowledge described in this article make an effort to keep their brains in shape.
> LLMs are regression-to-the-mean machines--they pull junior developers up, and drag senior developers down. Taming them requires trading the romance of 'code as craft' for the physics of manufacturing.
The thing I don't know is: how do we decide which direction is most valuable? I can see arguments in both directions--quality vs quantity, essentially. I think there's a strong argument for the value of both:
- we need more quantity of software: for a long time, the ability to write software has been locked up, confined to a closed cabal of specialists
- we need more quality in software: we depend more and more on software in every aspect of our lives, mistakes are intolerable and should be avoided
I'm lucky to work with great engineers, and their productivity and code quality have become even higher. Wish that wasn't the case, but it is, and that also puts a lot of pressure on me to work more and better all the time. It's exhausting.
There are cons too: understanding of the system is sometimes not as intimate, which in turn produces fewer "gotcha" moments that may lead to better design. There's less time to review PRs and make it a choral work.
On the other hand way more refactors and experiments can be run, so again, code quality has improved just because if you have a hunch that something could be done better, you can test it for cheap.
There's more to the quality of the output, like prompts, the quality of the codebase (from which the LLMs learn), the documentation/harnessing, the feedback an engineer provides while reviewing multiple times (in the chat, in the diff, in the PR), etc.
I recently had Cursor evaluate a huge code base that we took over. All public stuff, nothing scary security-wise, but it was so convoluted that it was taking me forever to find the bugs. It was written by a person, I should add.
I did this in Cursor, and after one prompt using Plan it found all the bugs and created a plan to fix them. The plan looked good, so I had the agent create the fix.
It took 30 minutes.
The client had this project in the hands of another company without AI tools, and they couldn’t fix the bugs she told them about.
So my point is, if we are holding on to our jobs for dear life on the basis that “code quality” matters, you might as well kick down the 4th pillar. Like I said, the LLM does not care.
Why aren't the designers and PMs shipping things if these tools are so good?