
Posted by robotswantdata 6/30/2025

The new skill in AI is not prompting, it's context engineering (www.philschmid.de)
915 points | 518 comments | page 4
_pdp_ 6/30/2025|
It is wrong. The new/old skill is reverse engineering.

If the majority of the code is generated by AI, you'll still need people with technical expertise to make sense of it.

CamperBob2 6/30/2025||
Not really. Got some code you don't understand? Feed it to a model and ask it to add comments.

Ultimately humans will never need to look at most AI-generated code, any more than we have to look at the machine language emitted by a C compiler. We're a long way from that state of affairs -- as anyone who struggled with code-generation bugs in the first few generations of compilers will agree -- but we'll get there.

inspectorwadget 6/30/2025|||
>any more than we have to look at the machine language emitted by a C compiler.

Some developers do actually look at the output of C compilers, and some of them even spend a lot of time criticizing that output by a specific compiler (even writing long blog posts about it). The C language has an ISO specification, and if a compiler does not conform to that specification, it is considered a bug in that compiler.

You can even go to godbolt.org / compilerexplorer.org and see the output generated for different targets by different compilers for different languages. It is a popular tool, also for language development.

I do not know what prompt engineering will look like in the future, but without AGI, I remain skeptical of the idea that verification of different kinds of code will not be required in at least a sizable proportion of cases. That does not exclude usefulness, of course: for instance, there may be cases where verification is not needed; or where verification can be done efficiently and robustly by a relevant expert; or where some smart method of verification, like a few primitive tests, is sufficient.

But I have no experience with LLMs or prompt engineering.

I do, however, sympathize with not wanting to deal with paying programmers. Most are likely nice, but for instance a few may be costly, or less than honest, or less than competent, etc. But while I think it is fine to explore LLMs and invest a lot into seeing what might come of them, I would not personally bet everything on them, neither in the short term nor the long term.

May I ask what your professional background and experience is?

CamperBob2 7/1/2025||
> Some developers do actually look at the output of C compilers, and some of them even spend a lot of time criticizing that output by a specific compiler (even writing long blog posts about it). The C language has an ISO specification, and if a compiler does not conform to that specification, it is considered a bug in that compiler.

Those programmers don't get much done compared to programmers who understand their tools and use them effectively. Spending a lot of time looking at assembly code is a career-limiting habit, as well as a boring one.

> I do not know what prompt engineering will look like in the future, but without AGI, I remain skeptical of the idea that verification of different kinds of code will not be required in at least a sizable proportion of cases. That does not exclude usefulness, of course: for instance, there may be cases where verification is not needed; or where verification can be done efficiently and robustly by a relevant expert; or where some smart method of verification, like a few primitive tests, is sufficient.

Determinism and verifiability are things we'll have to leave behind pretty soon. It's already impossible for most programmers to comprehend (or even access) all of the code they deal with, just due to the sheer size and scope of modern systems and applications, much less exercise and validate all possible interactions. A lot of navel-gazing about fault-tolerant computing is about to become more than just philosophical in nature, and relevant to more than hardware architects.

In any event, regardless of your and my opinions of how things ought to be, most working programmers never encounter compiler output unless they accidentally open the assembly window in their debugger. Then their first reaction is "WTF, how do I get out of this?" We can laugh at those programmers now, but we'll all end up in that boat before long. The most popular high-level languages in 2040 will be English and Mandarin.

> May I ask what your professional background and experience is?

Probably ~30 kloc of C/C++ per year since 1991 or thereabouts. Possibly some of it running on your machine now (almost certainly true in the early 2000s but not so much lately.)

Probably 10 kloc of x86 and 6502 assembly code per year in the ten years prior to that.

> But I have no experience with LLMs or prompt engineering.

May I ask why not? You and the other users who voted my post down to goatse.cx territory seem to have strong opinions on the subject of how software development will (or at least should) work going forward.

inspectorwadget 7/1/2025||
For the record, I did not downvote anyone.

>[Inspecting assembly and caring about its output]

I agree that it does not make sense for everyone to inspect generated assembly code, but for some jobs, like compiler developers, it is normal to do so, and for some other jobs it can make sense to do so occasionally. But inspecting assembly was not my main point. My main point was that a lot of people, probably many more than those who inspect assembly code, care about the generated code. If a C compiler does not conform to the C ISO specification, a C programmer who does not inspect assembly can still decide to file a bug report, due to caring about conformance of the compiler.

The scenario you describe, as I understand it at least, is one of codebases so complex, and with quality requirements so low, that inspecting the code (not assembly, but the output from LLMs) is unnecessary, or where mitigation strategies are sufficient. That is not consistent with a lot of existing codebases, or parts of codebases. And even for very large and messy codebases, there are still often abstractions and layers. Yes, there can be abstraction leakage in systems, and fault tolerance against not just software bugs but unchecked code can be a valuable approach. But I am not certain it would make sense to have even most code be unchecked (in the sense of never having been reviewed by a programmer).

I also doubt a natural language would replace a programming language, at least if verification or AGI is not included. English and Mandarin are ambiguous. C and assembly code are (meant to be) unambiguous, and it is generally considered a significant error if a programming language is ambiguous. Without verification of some kind, or an expert (human or AGI), how could one in general use that code safely and usefully? There could be cases where other kinds of mitigation would do, but there is at least a large proportion of cases where I am skeptical that mitigation strategies alone would be sufficient.

fumblingness 7/2/2025||||
"And at no point does it ever occur to you to demand proof that measures such as this will have the desired effect... or, indeed, that the desired effect is indeed worth achieving at all."

- you (https://news.ycombinator.com/item?id=44439447)

CamperBob2 7/4/2025||
(Shrug) There's a difference between prescription and prediction. I predict that after 50 years of doing the same old shit the same old way, the practice of programming is about to undergo a series of wrenching changes that amount to nothing less than revolution. Changes powered by radical new insights into the nature and function of language itself.

I'm not initiating these changes, voting for them, or attempting to persuade other people to do so, as the person I replied to in the other thread is doing. I do welcome having something new and interesting to learn and think about, though.

Anyway, always good to hear from a new fan!

rvz 6/30/2025|||
> Not really. Got some code you don't understand? Feed it to a model and ask it to add comments.

Absolutely not.

An experienced individual in their field can tell if the AI made a mistake in the comments or code; the typical untrained eye cannot.

So no, actually read the code and understand what it does.

> Ultimately humans will never need to look at most AI-generated code, any more than we have to look at the machine language emitted by a C compiler.

So for safety critical systems, one should not look or check if code has been AI generated?

CamperBob2 7/1/2025||
> So for safety critical systems, one should not look or check if code has been AI generated?

If you don't review the code your C compiler generates now, why not? Compiler bugs still happen, you know.

supriyo-biswas 7/1/2025|||
You do understand that LLM output is non-deterministic and tends to have a higher error ratio than compiler bugs, which do not exhibit this “feature”.

I see in one of your other posts that you were loudly grumbling about being downvoted. You may want to revisit if taking a combative, bad faith approach while replying to other people is really worth it.

CamperBob2 7/1/2025||
> I see in one of your other posts that you were loudly grumbling about being downvoted. You may want to revisit if taking a combative, bad faith approach while replying to other people is really worth it.

(Shrug) Tool use is important. People who are better than you at using tools will outcompete you. That's not an opinion or "combative," whatever that means, just the way it works.

It's no skin off my nose either way, but HN is not a place where I like to see ignorant, ill-informed opinions paraded with pride.

rvz 7/1/2025|||
> If you don't review the code your C compiler generates now, why not?

That isn't a reason why you should NOT review AI-generated code. Even when comparing the two, a C compiler is far more deterministic in the code that it generates than LLMs, which are non-deterministic and unpredictable by design.

> Compiler bugs still happen, you know.

The whole point is 'verification' which is extremely important in compiler design and there exists a class of formally-verified compilers that are proven to not generate compiler bugs. There is no equivalent for LLMs.

In any case, you still NEED to check whether the code's functionality matches the business requirements, AI-generated or not, especially in safety-critical systems. Otherwise, it is a logic bug in your implementation.

CamperBob2 7/1/2025||
If you can look at what's happening today, and imagine that code will still be generated the same way in 10-15 years as it is today, then your imagination beats mine.

99.9999% of code is not written with compilers that are "formally verified" as immune to code-generation bugs. It's not likely that any code that you and I run every day is.

rvz 7/1/2025||
> 99.9999% of code is not written with compilers that are "formally verified" as immune to code-generation bugs.

Again, that isn't a reason to never check or write tests for your code just because an AI generated it, or to assume that an AI will detect all of the bugs.

In fact, it means you NEED to do more reviewing, checking and testing than ever before.

> It's not likely that any code that you and I run every day is.

So millions of phones, cars, control systems, medical devices and planes in use today aren't running formally verified code every day?

Are you sure?

CamperBob2 7/2/2025||
Yes, I'm very sure. 99.9999% of the code you are running is not formally proven to be correct, and was not generated by a compiler whose output was formally proven to be correct.

Just curious, how much time have you spent in (a) industry, (b) a CS classroom, or (c) both?

rvz 7/3/2025||
> Yes, I'm very sure.

You do understand that you are proving my entire point? It is still not a reason to *NOT* test or check your code implementation at all or to only rely on an LLM to check it for you.

What it really means is that software testing is more important than ever.

As for running formally verified code every day: seL4 runs on the iPhone's security chip (the secure enclave) in the hands of billions of users, and it is a formally verified microkernel used for cryptographic operations, from payments to disk encryption, every day.

This kernel is also used on medical devices, cars and in defense equipment, relied on by hundreds of millions of users.

> Just curious, how much time have you spent in (a) industry, (b) a CS classroom, or (c) both?

Enough decades to know that no process for developing safety-critical system software would allow AI-generated code that isn't checked by a human, or that is checked only by other LLMs as a substitute for writing tests.

adhamsalama 6/30/2025||
There is no engineering involved in using AI. It's insulting to call begging an LLM "engineering".
rednafi 6/30/2025|
This. Convincing a bullshit generator to give you the right data isn’t engineering, it’s quackery. But I guess “context quackery” wouldn’t sell as much.

LLMs are quite useful and I leverage them all the time. But I can’t stand these AI yappers saying the same shit over and over again in every media format and trying to sell AI usage as some kind of profound wizardry when it’s not.

mikhmha 6/30/2025|||
It is total quackery. When you zoom out in these discussions you begin to see how the AI yappers' methodology is just modern-day alchemy with its own jargon and "esoteric" techniques.
simonw 6/30/2025||
See my comment here. These new context engineering techniques are a whole lot less quackery than the prompting techniques from last year: https://news.ycombinator.com/item?id=44428628
ModernMech 7/1/2025|||
The quackery comes in the application of these techniques, promising that they "work" without ever really showing it. Of course what's suggested in that blog sounds rational -- they're just restating common project management practices.

What makes it quackery is there's no evidence to show that these "suggestions" actually work (and how well) when it comes to using LLMs. There's no measurement, no rigor, no analysis. Just suggestions and anecdotes: "Here's what we did and it worked great for us!" It's like the self-help section of the bookstore, but now we're (as an industry) passing it off as technical content.

asadotzler 7/1/2025|||
"less"
Zopieux 6/30/2025|||
That's the definition of a hype cycle. Can't wait for tech to be past it.
retinaros 6/30/2025||
It is still sending a string of chars and hoping the model outputs something relevant. Let’s not do what finance did and permanently obfuscate really simple stuff to make ourselves seem bigger than we are.

prompt engineering/context engineering: stringbuilder

Retrieval augmented generation: search + adding strings to the main string

test time compute: running multiple generations and choosing the best

agents: for loop and some ifs
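
For what it's worth, the last two reductions really do fit in a few lines of plain Python; `search_index`, `llm`, and `tools` below are hypothetical stand-ins supplied by the caller, not any particular library:

```python
# Crude sketch of "RAG = search + string concatenation" and "agent = for loop and some ifs".
# `search_index`, `llm`, and `tools` are hypothetical callables, not a real library.

def rag_prompt(question, search_index):
    # Retrieval augmented generation: search, then add strings to the main string.
    snippets = search_index(question, top_k=3)
    return "\n\n".join(snippets) + "\n\nQuestion: " + question

def agent(task, llm, tools, max_steps=10):
    # Agent: a for loop and some ifs wrapped around a model call.
    context = task
    reply = ""
    for _ in range(max_steps):
        reply = llm(context)
        if reply.startswith("TOOL:"):
            name, arg = reply[len("TOOL:"):].split(" ", 1)
            context += "\n" + tools[name](arg)  # append the tool output and loop again
        else:
            break  # the model produced a final answer
    return reply
```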

lawlessone 6/30/2025||
I look forward to 5 million LinkedIn posts repeating this
octo888 7/1/2025||
"The other day my colleague walked up to me and said Jon, prompting is the new skill that's needed.

I laughed and told them they're wrong. Here's why ->"

pyman 6/30/2025||
Someone needs to build a Chrome extension called "BS Analysis" for LinkedIn
jongjong 6/30/2025||
Recently I started work on a new project and I 'vibe coded' a test case for a complex OAuth token expiry bug entirely with AI (with Cursor), complete with mocks and stubs... And it was on someone else's project. I had no prior familiarity with the code.

That's when I understood that vibe coding is real and context is the biggest hurdle.

That said, most of the context could not be pulled from the codebase directly but came from me after asking the AI to check/confirm certain things that I suspected could be the problem.

I think vibe coding can be very powerful in the hands of a senior developer, because if you're the kind of person who can clearly explain their intuitions with words, that's exactly the missing piece the AI needs to solve the problem... And you still need to handle the code review aspect, which is also something senior devs are generally good at. Sometimes it makes mistakes/incorrect assumptions.

I'm feeling positive about LLMs. I was always complaining about other people's ugly code before... I HATE over-modularized, poorly abstracted code where I have to jump across 5+ different files to figure out what a function is doing; with AI, I can just ask it to read all the relevant code across all the files and tell me WTF the spaghetti is doing... Then it generates new code which 'follows' existing 'conventions' (same level of mess). The AI basically automates the most horrible aspect of the work; making sense of the complexity and churning out more complexity that works. I love it.

That said, in the long run, to build sustainable projects, I think it will require following good coding conventions and minimal 'low code' coding... Because the codebase could explode in complexity if not used carefully. Code quality can only drop as the project grows. Poor abstractions tend to stick around and have negative flow-on effects which impact just about everything.

0xfaded 7/1/2025||
At work we've licensed cursor, but as a vim holdout it's a nogo and we're otherwise somewhat restricted on what we can install.

I have 3 vim commands:

ZB $n: paste the buffer $n inside backticks along with the file path.

Z: Run the current buffer through our llm and append the output

ZI: Run the yank register through our llm and insert the output at the cursor.

The commands also pass along my AGENTS.md

Basically I'm manually building the context. One thing I really like is that when it outputs something stupid, I can just edit that part. E.g. if I ask for a plan to do something, and I don't like step 5, I can just delete it.

One humorous side effect is that without the clear chat structure, it sometimes has difficulty figuring out the end-of-stream. It can end with a question like "would you like me to ...?", answer itself yes, and keep going.
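
For the curious, a rough sketch of the kind of stdin-to-stdout filter a mapping like Z could shell out to; the file names and model name are hypothetical, and it assumes an OpenAI-compatible endpoint via the `openai` Python package, which may differ from the setup described above:

```python
#!/usr/bin/env python3
# Hypothetical filter: read the buffer from stdin, prepend AGENTS.md, write the completion to stdout.
import sys
from pathlib import Path

from openai import OpenAI  # assumes an OpenAI-compatible endpoint

def main():
    agents = Path("AGENTS.md")
    system = agents.read_text() if agents.exists() else ""
    buffer_text = sys.stdin.read()  # the buffer (or yank register) piped in from vim

    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": buffer_text},
        ],
    )
    sys.stdout.write(resp.choices[0].message.content)

if __name__ == "__main__":
    main()
```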

colgandev 6/30/2025||
I've been finding a ton of success lately with speech to text as the user prompt, and then using https://continue.dev in VSCode, or Aider, to supply context from files from my projects and having those tools run the inference.

I'm trying to figure out how to build a "Context Management System" (as compared to a Content Management System) for all of my prompts. I completely agree with the premise of this article: if you aren't managing your context, you are losing all of the context you create every time you create a new conversation. I want to collect all of the reusable blocks from every conversation I have, as well as from my research and reading around the internet. Something like a mashup of Obsidian with some custom Python scripts.

The ideal inner loop I'm envisioning is to create a "Project" document that uses Jinja templating to allow transclusion of a bunch of other context objects like code files, documentation, articles, and then also my own other prompt fragments, and then to compose them in a master document that I can "compile" into a "superprompt" that has the precise context that I want for every prompt.
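
A minimal sketch of that "compile" step with plain Jinja2; the folder, template, and fields here are hypothetical placeholders:

```python
# Hypothetical: render a "Project" template into a single one-shot superprompt.
from pathlib import Path

from jinja2 import Environment, FileSystemLoader

env = Environment(loader=FileSystemLoader("context"))   # folder of prompt fragments, docs, notes
template = env.get_template("project.md.j2")            # can {% include "fragments/conventions.md" %}

superprompt = template.render(
    task="Fix the OAuth token expiry bug",                            # this turn's goal
    code_files={p: Path(p).read_text() for p in ["app/models.py"]},   # code files to transclude
)
print(superprompt)  # paste or send as one prompt instead of a running chat
```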

Since the chat interfaces are already just re-sending the entire previous conversation message history anyway, I don't even really want to use a chat-style interface so much as just "one shotting" the next step in development.

It's almost a turn based game: I'll fiddle with the code and the prompts, and then run "end turn" and now it is the llm's turn. On the llm's turn, it compiles the prompt and runs inference and outputs the changes. With Aider it can actually apply those changes itself. I'll then review the code using diffs and make changes and then that's a full turn of the game of AI-assisted code.

I love that I can just brain dump into speech to text, and llms don't really care that much about grammar and syntax. I can curate fragments of documentation and specifications for features, and then just kind of rant and rave about what I want for a while, and then paste that into the chat and with my current LLM of choice being Claude, it seems to work really quite well.

My Django work feels like it's been supercharged with just this workflow, and my context management engine isn't even really that polished.

If you aren't getting high quality output from llms, definitely consider how you are supplying context.

bravesoul2 7/1/2025||
If you have a big enough following you can say the obvious and get a rapturous applause.
askonomm 7/1/2025||
So ... have we circled back yet to realizing why COBOL didn't work? This AI magic whispering is getting real close to the point where it just makes more sense to "old-school" write programs again.
pvdebbe 7/1/2025|
The new AI winter can't come soon enough.
bsoles 7/2/2025|
> "the art of providing all the context for the task to be plausibly solvable by the LLM.”

And who is going to do that? The "context engineer", who doesn't know anything about the subject and runs to the LLM for quick answers without having any ability to evaluate if the answer is solid or not?

We saw the same story with "data scientists". A general understanding of tools with no understanding of the specific application areas is bound to result in crappy products, if not in business disasters.

More comments...