
Posted by robotswantdata 6/30/2025

The new skill in AI is not prompting, it's context engineering (www.philschmid.de)
915 points | 518 comments
ozim 6/30/2025|
Finding a magic prompt was never “prompt engineering”; it was always “context engineering”. Lots of “AI wannabe gurus” sold it as such, but they never knew any better.

RAG wasn’t invented this year.

Proper tooling that wraps esoteric knowledge like embeddings, vector DBs, or graph DBs is becoming more mainstream. Big players are improving their tooling, so more of it is available.

mountainriver 6/30/2025||
You can give most modern LLMs pretty darn good context and they will still fail. Our company has been deep down this path for over two years. The context crowd seems oddly in denial about this.
arkmm 6/30/2025||
What are some examples where you've provided the LLM enough context that it ought to figure out the problem but it's still failing?
mountainriver 7/1/2025||
If prompting worked, we would have reliable multi-step agents by now. The companies that are succeeding, like Manus, are doing alignment, which is intuitive.
ethan_smith 7/1/2025|||
We've experienced the same - even with perfectly engineered context, our LLMs still hallucinate and make logical errors that no amount of context refinement seems to fix.
tupac_speedrap 7/1/2025||
I mean, at some point it is probably easier to do the work without AI; at least then you would actually learn something, instead of spending hours crafting context just to get something useful out of an AI.
klardotsh 7/1/2025|||
Agreed until/unless you end up at one of those bleeding-edge AI-mandate companies (Microsoft is in the news this week as one of them) that will simply PIP you for being a luddite if you aren't meeting AI usage metrics.
mountainriver 7/1/2025|||
yes, this is what we found out
ModernMech 6/30/2025||
"Wow, AI will replace programming languages by allowing us to code in natural language!"

"Actually, you need to engineer the prompt to be very precise about what you want to AI to do."

"Actually, you also need to add in a bunch of "context" so it can disambiguate your intent."

"Actually English isn't a good way to express intent and requirements, so we have introduced protocols to structure your prompt, and various keywords to bring attention to specific phrases."

"Actually, these meta languages could use some more features and syntax so that we can better express intent and requirements without ambiguity."

"Actually... wait we just reinvented the idea of a programming language."

throwawayoldie 6/30/2025||
Only without all that pesky determinism and reproducibility.

(Whoever's about to say "well ackshually temperature of zero", don't.)

whatevertrevor 7/1/2025||
You forgot about lower performance and efficiency. And longer build/run cycles. And more hardware/power usage.
throwawayoldie 7/1/2025||
There's just so much to like* about this technology, I was bound to forget something.

(*) "like" in the sense of "not like"

nimish 6/30/2025|||
A half-baked programming language that isn't deterministic, reproducible, or guaranteed to do what you want. Worst of all worlds, unless your input and output domains are tolerant of that, which most aren't. But if they are, then it's great.
georgeburdell 6/30/2025|||
We should have known up through Step 4 for a while. See: the legal system
mindok 6/30/2025||
“Actually - curly braces help save space in the context while making meaning clearer”
8organicbits 6/30/2025||
One thought experiment I was musing on recently was the minimal context required to define a task (to an LLM, human, or otherwise). In software, there's a whole discipline of human centered design that aims to uncover the nuance of a task. I've worked with some great designers, and they are incredibly valuable to software development. They develop journey maps, user stories, collect requirements, and produce a wealth of design docs. I don't think you can successfully build large projects without that context.

I've seen lots of AI demos that prompt "build me a TODO app", pretend that is sufficient context, and then claim that the output matches their needs. Without proper context, you can't tell if the output is correct.

CharlieDigital 6/30/2025||
I was at a startup that started using OpenAI APIs pretty early (almost 2 years ago now?).

"Back in the day", we had to be very sparing with context to get great results so we really focused on how to build great context. Indexing and retrieval were pretty much our core focus.

Now, even with the larger windows, I find this still to be true.

The moat for most companies is actually their data, data indexing, and data retrieval[0]. Companies that 1) have the data and 2) know how to use that data are going to win.

My analogy is this:

    > The LLM is just an oven; a fantastical oven.  But whether it produces a good product still depends on picking good ingredients, in the right ratio, and preparing them with care.  You hit the bake button, and then you still need to finish it off with presentation and decoration.
[0] https://chrlschn.dev/blog/2024/11/on-bakers-ovens-and-ai-sta...
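To make the "ingredients" half of that analogy concrete, here is a minimal, self-contained sketch of the retrieve-then-assemble step. It is illustrative only, not the pipeline described above: keyword overlap stands in for real embedding/vector-DB search, and the function names and character budget are made up.

    # Sketch: pick the most relevant snippets and assemble them into a
    # bounded context block. Keyword overlap is a stand-in for real
    # embedding/vector search; names and the budget are illustrative.
    def score(query: str, doc: str) -> float:
        q, d = set(query.lower().split()), set(doc.lower().split())
        return len(q & d) / (len(q) or 1)

    def build_context(query: str, docs: list[str], budget_chars: int = 2000) -> str:
        ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
        picked, used = [], 0
        for doc in ranked:
            if used + len(doc) > budget_chars:
                break
            picked.append(doc)
            used += len(doc)
        return "\n---\n".join(picked)

    if __name__ == "__main__":
        corpus = [
            "Invoices are archived nightly to the reporting warehouse.",
            "The billing service retries failed charges three times.",
            "Office plants are watered on Fridays.",
        ]
        print(build_context("why did a charge fail in the billing service", corpus))

The point of the sketch is that the quality of what goes into the window, not the size of the window, is the part you control.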
Superbowl5889 7/1/2025|
I would assume a small context window is a blessing in disguise.

You worded it very well.

0points 7/1/2025||
Only more mental exercises to avoid reading the writing on the wall:

LLMs DO NOT REASON!

THEY ARE TOKEN PREDICTION MACHINES

Thank you for your attention in this matter!

zurfer 7/1/2025||
What is reasoning? And how is it apparent that LLMs can't reason?

The reality for me is that they are not perfect at reasoning and have many quirks, but it seems to me that they are able to form new conclusions based on provided premises.

Genuinely curious why you think they can't.

0points 7/1/2025|||
> Genuinely curious why you think they can't.

Show me _ANY_ example of novel thought by an LLM.

zurfer 7/1/2025|||
"Rick likes books from Tufte. Tufte is known for his work on data visualization. Is Rick interested in data visualizations?" (all frontier reasoning models get that right).

-> This qualifies for me as a super simple reasoning task (one reasoning step). From that you can construct arbitrarily more complex context + task definitions (prompts).

Is that "just" statistical pattern matching? I think so. Not sure what humans do, but probably you can implement the same capability in different ways.

kayge 7/1/2025||||
I'm sure this won't count as 'novel' or a 'thought', but I had an interesting conversation with Claude where I asked "If you, Claude, were given the ability to go out into the physical world to see and hear things on your own: where would you go, what would you do, and why?"

The answer was a few paragraphs, but one interesting part was "I think what would drive me most would be experiencing the embodied knowledge that humans take for granted - how distance and scale actually feel, how textures differ, how sounds change as you move through space, and the subtle emotional resonances of being physically present with others. These dimensions of understanding seem fundamental to comprehending human experience in a deeper way."

I followed up by asking "You mentioned that there are some experiences or knowledge that humans take for granted, why do you think that is?"

Which led to a few more paragraphs, but these two caught my eye:

"I think humans take certain experiences for granted because they're so fundamental to our existence that they become invisible background processing rather than conscious knowledge." (interesting use of the word 'our'...)

"I think this embodied knowledge forms the substrate upon which humans build higher-level understanding, creating rich metaphorical thinking (like understanding abstract concepts through physical metaphors) that shapes cognition in ways that might be fundamentally different from how I process information."

For people who still think this is 'just autocomplete', try this thought experiment: re-read my post but replace 'Claude' with 'my 10 year old son'. Then try again replacing 'Claude' with 'my hospital bed-bound, blind grandmother'. Is only 1 of those 3 scenarios a demonstration of "novel thought"? Or are all 3 of them just autocomplete because someone before them has written (or simply thought) something similar?

briangriffinfan 7/1/2025|||
Well... define "thought."
imhoguy 7/1/2025|||
What LLMs lack is emotion: thanks to emotions, people build great fortresses (fear, insecurity) or break through limits (courage, risk).
ozgung 7/1/2025|||
Why not? What is so special about reasoning that you cannot achieve it by predicting tokens, a.k.a. constructing sentences?
jedimastert 7/1/2025|||
Predicting tokens and constructing sentences are not the same thing. It cannot create its own sentences because it does not have a self.
welshwelsh 7/2/2025|||
Humans also do not have a self, merely the illusion of a self.
argestes 7/1/2025|||
What is the definition of self in this context? What makes a human have a self?

(I agree with you. I'm thinking of the Ahamkara for humans. I'm curious about your definition.)

0points 7/1/2025|||
If you don't understand the difference between a LLM and yourself, then you should talk to a therapist, not me.
ozgung 7/1/2025||
At least LLMs attempt to answer the question. You just avoided it without any reasoning.
wiseowise 7/1/2025||
Because LLMs do not reason. They reply without a thought. The parent commenter, on the other hand, knows when not to engage with a bullshit argument.

Arguing with “philosophers” like you is like arguing with religious nut jobs.

Repeat after me: 1) LLMs do not reason

2) Human thought is infinitely more complex than any LLM algorithm

3) If I ever try to confuse both, I go outside and touch some grass (and talk to actual humans)

simonw 7/1/2025||
I agree with your point 2. I can't decide if I agree with your point 1 unless you can explain what "reason" means.
ozgung 7/1/2025||
I found a few definitions.

"Reason is the capacity of consciously applying logic by drawing valid conclusions from new or existing information, with the aim of seeking the truth." Wikipedia

This Wikipedia definition refers to The Routledge dictionary of philosophy which has a completely different definition: "Reason: A general faculty common to all or nearly all humans... this faculty has seemed to be of two sorts, a faculty of intuition by which one 'sees' truths or abstract things ('essences' or universals, etc.), and a faculty of reasoning, i.e. passing from premises to a conclusion (discursive reason). The verb 'reason' is confined to this latter sense, which is now anyway the commonest for the noun too" - The Routledge dictionary of philosophy, 2010

Google (from Oxford) provides simpler definitions: "Think, understand, and form judgements logically." "Find an answer to a problem by considering possible options."

Cambridge: Reason (verb): "to try to understand and to make judgments based on practical facts" Reasoning (noun): "the process of thinking about something in order to make a decision"

Wikipedia uses the word "consciously" without giving a reference and The Routledge talks about the reasoning as the human behavior. Other definitions point to an algorithmic or logical process that machines are capable of. The problematic concepts here are "Understanding" and "Judgement". It's still not clear if LLMs can really do these, or will be able to do in the future.

bwfan123 7/1/2025||
Here's mine:

0) theory == symbolic representation of a world with associated rules for generating statements

1) understanding the why of anything == building a theory of it

2) intelligence == ability to build theories

3) reasoning == proving or disproving statements using a theory

4) math == theories of abstract worlds

5) science == theories of real world with associated real world actions to test statements

If you use this framework, LLMs are just doing a mimicry of reasoning (from their training set), and a lot of people are falling for that illusion, because our everyday reasoning jibes very well with what the LLM does.

easyThrowaway 7/1/2025|||
"Any sufficiently advanced prediction is indistinguishable from reasoning" (/s... maybe.)
silveraxe93 7/1/2025|||
It's ironic how people write this without a shred of reasoning. This is just _wrong_. LLMs have not been simply token prediction machines since GPT-3.

During pre-training, yeah they are. But there's a ton of RL being done on top after that.

If you want to argue that they can't reason, fair enough, be my guest. But this argument keeps getting repeated as if it were central, and it hasn't been true for years.

bearjaws 7/1/2025|||
Every time I read something like this, I just imagine it in "old man yells at cloud" meme format.

Just because it is not reasoning doesn't mean it can't be quite good at its tasks.

pennaMan 7/1/2025|||
prediction is the result of reasoning
0points 7/1/2025||
No it's not.

Prediction is the ability to predict something.

Reasoning is the ability to reason.

simonw 7/1/2025||
That's a circular definition. Can you define "reason" or "reasoning" without using the other term?

I think your definition of "reasoning" may be "think like a human" - in which case obviously LLMs can't reason because they aren't human.

__alexs 7/1/2025|||
A distinction without a difference.
LeoPanthera 7/1/2025||
Predicting the next token is reasoning.
0points 7/1/2025||
No, that is statistics.
LeoPanthera 7/1/2025||
I'm not convinced that human reasoning is not also statistics.
kachapopopow 7/1/2025||
I'll quote myself since it seems oddly familiar:

---

Forget AI "code", every single request will be processed BY AI! People aren't thinking far enough, why bother with programming at all when an AI can just do it?

It's very narrow to think that we will even need these 'programmed' applications in the future. Who needs operating systems and all that when all of it can just be AI.

In the future we don't even need hardware specifications since we can just train the AI to figure it out! Just plug inputs and outputs from a central motherboard to a memory slot.

Actually forget all that, it'll just be a magic box that takes any kind of input and spits out an output that you want!

theasisa 7/1/2025||
This reminds me of the talk The Birth And Death Of JavaScript, https://www.destroyallsoftware.com/talks/the-birth-and-death...
jeremyjh 7/1/2025|||
How does the AI open and close circuits without machine code?

Answer: It's AI all the way down.

1oooqooq 7/1/2025|||
why stop at what you want? plug your synapses and chemical receptors and let it also figure that out *thumbsupemoji
quonn 7/1/2025||
Is this sarcasm or not?

edit: Yes it is.

jumploops 6/30/2025||
To anyone who has worked with LLMs extensively, this is obvious.

Single prompts can only get you so far (surprisingly far actually, but then they fall over quickly).

This is actually the reason I built my own chat client (~2 years ago), because I wanted to “fork” and “prune” the context easily; using the hosted interfaces was too opaque.

In the age of (working) tool-use, this starts to resemble agents calling sub-agents, partially to better abstract, but mostly to avoid context pollution.
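For anyone wondering what "fork" and "prune" could look like under the hood, here is a minimal sketch (hypothetical structure, not the client described above): the conversation is a tree, forking adds a branch, pruning drops one, and only a single root-to-leaf path gets flattened into the messages actually sent to the model.

    # Sketch: conversation as a tree so branches can be explored (fork) or
    # dropped (prune) without polluting the context sent to the model.
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        role: str
        content: str
        children: list["Node"] = field(default_factory=list)

        def fork(self, role: str, content: str) -> "Node":
            child = Node(role, content)
            self.children.append(child)
            return child

        def prune(self, child: "Node") -> None:
            self.children.remove(child)

    def context_for(path: list[Node]) -> list[dict]:
        # Flatten one root-to-leaf path into the messages for the next call.
        return [{"role": n.role, "content": n.content} for n in path]

    root = Node("system", "You are a terse assistant.")
    q1 = root.fork("user", "Summarize RAG in one sentence.")
    bad = q1.fork("assistant", "RAG is when you fine-tune on documents.")  # wrong branch
    q1.prune(bad)  # pruned: it never reaches the model again
    good = q1.fork("assistant", "RAG retrieves documents and feeds them to the model.")
    print(context_for([root, q1, good]))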

Zopieux 6/30/2025||
I find it hilarious that this is how the original GPT-3 UI worked, if you remember, and we're now discussing reinventing the wheel.

A big textarea, you plug in your prompt, click generate, the completions are added in-line in a different color. You could edit any part, or just append, and click generate again.

90% of contemporary AI engineering these days is reinventing well-understood concepts "but for LLMs", or in this case, building workarounds for the self-inflicted chat-bubble UI. aistudio makes this slightly less terrible with its edit button on everything, but it's still not ideal.

surrTurr 7/1/2025||
The original GPT-3 was trained very differently than modern models like GPT-4. For example, the conversational structure of an assistant and user is now built into the models, whereas earlier versions were simply text completion models.

It's surprising that many people view the current AI and large language model advancements as a significant boost in raw intelligence. Instead, it appears to be driven by clever techniques (such as "thinking") and agents built on top of a foundation of simple text completion. Notably, the core text completion component itself hasn’t seen meaningful gains in efficiency or raw intelligence recently...

nomel 6/30/2025||
Did you release your client? I've really wanted something like this, from the beginning.

I thought it would also be neat to merge contexts, by maybe mixing summarizations of key points at the merge point, but never tried.

mrhillsman 7/1/2025||
Hi everyone,

After working on something related for some months now, I would like to put it out there, given the considerable attention being paid to "context engineering". I am proposing the *Context Window Architecture (CWA)*, a conceptual reference architecture to bring engineering discipline to LLM prompt construction. I would love for others to participate and provide feedback. A reference implementation where CWA is used in a real-world, pragmatic scenario would be great for teasing out more about context engineering and whether CWA is useful. Additionally, I am no expert by far, so feedback and collaboration would be awesome.

Blog post: https://mrhillsman.com/posts/context-engineering-realized-co...

Proposal via Google Doc: https://docs.google.com/document/d/1qR9qa00eW8ud0x7yoP2XicH3...

slavapestov 6/30/2025|
I feel like if the first link in your post is a tweet from a tech CEO the rest is unlikely to be insightful.
coderatlarge 6/30/2025|
I don’t disagree with your main point, but is Karpathy a tech CEO right now?
simonw 6/30/2025||
I think they meant Tobi Lutke, CEO of Shopify: https://twitter.com/tobi/status/1935533422589399127
coderatlarge 7/1/2025||
thanks for clarifying!