Posted by robotswantdata 13 hours ago
I've seen lots of AI demos that prompt "build me a TODO app", pretend that is sufficient context, and then claim that the output matches their needs. Without proper context, you can't tell if the output is correct.
"Back in the day", we had to be very sparing with context to get great results so we really focused on how to build great context. Indexing and retrieval were pretty much our core focus.
Now, even with the larger windows, I find this still to be true.
The moat for most companies is actually their data, data indexing, and data retrieval[0]. Companies that 1) have the data and 2) know how to use that data are going to win.
My analogy is this:
> The LLM is just an oven; a fantastical oven. But for it to produce a good product still depends on picking good ingredients, in the right ratio, and preparing them with care. You hit the bake button, then you still need to finish it off with presentation and decoration.
[0] https://chrlschn.dev/blog/2024/11/on-bakers-ovens-and-ai-sta...

Single prompts can only get you so far (surprisingly far actually, but then they fall over quickly).
This is actually the reason I built my own chat client (~2 years ago), because I wanted to “fork” and “prune” the context easily; using the hosted interfaces was too opaque.
In the age of (working) tool-use, this starts to resemble agents calling sub-agents, partially to better abstract, but mostly to avoid context pollution.
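To make that concrete, here is a minimal sketch of what I mean by forking and pruning (not any particular chat API; all names are made up): the conversation is just a tree of messages, and only the un-pruned path to the current leaf gets sent to the model.

    from __future__ import annotations
    from dataclasses import dataclass

    @dataclass
    class Node:
        role: str                  # "system" | "user" | "assistant"
        text: str
        parent: Node | None = None
        pruned: bool = False       # pruned nodes never reach the model

    def fork(parent: Node, role: str, text: str) -> Node:
        """Start a new branch from any earlier message."""
        return Node(role, text, parent=parent)

    def context_for(leaf: Node) -> list[dict]:
        """Walk leaf -> root, drop pruned nodes, return messages oldest-first."""
        path, node = [], leaf
        while node is not None:
            if not node.pruned:
                path.append({"role": node.role, "content": node.text})
            node = node.parent
        return list(reversed(path))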
A big textarea: you plug in your prompt, click generate, and the completions are added in-line in a different color. You could edit any part, or just append, and click generate again.
90% of contemporary AI engineering is reinventing well-understood concepts "but for LLMs", or, in this case, building workarounds for the self-inflicted chat-bubble UI. aistudio makes this slightly less terrible with its edit button on everything, but it's still not ideal.
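For anyone who never used that flow, the whole loop is roughly this (complete() here is a stand-in for a raw text-completion endpoint, not a real API):

    def complete(text: str) -> str:
        # stand-in for a raw text-completion call; returns a canned string here
        return " <completion appended here>"

    buffer = "Write a limerick about context windows.\n"
    buffer += complete(buffer)                    # generate: output lands inline in the same buffer
    buffer = buffer.replace("limerick", "haiku")  # edit any part of the text, not just the tail
    buffer += complete(buffer)                    # regenerate from the edited state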
It's surprising that many people view the current AI and large language model advancements as a significant boost in raw intelligence. Instead, the progress appears to be driven by clever techniques (such as "thinking") and agents built on top of a foundation of simple text completion. Notably, the core text-completion component itself hasn't seen meaningful gains in efficiency or raw intelligence recently...
I thought it would also be neat to merge contexts, maybe by mixing in summarizations of key points at the merge point, but I never tried it.
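Never got further than the idea, but roughly what I had in mind looks like this (summarize_with_llm is a placeholder, not a real call):

    def summarize_with_llm(messages: list[dict]) -> str:
        # placeholder for an LLM call like "summarize the key points of this exchange"
        return " / ".join(m["content"][:60] for m in messages)

    def merge_branches(shared_prefix: list[dict], branch_a: list[dict], branch_b: list[dict]) -> list[dict]:
        # condense each branch to its key points, then continue from the shared prefix plus both summaries
        merge_note = {
            "role": "system",
            "content": (
                f"Key points from branch A: {summarize_with_llm(branch_a)}\n"
                f"Key points from branch B: {summarize_with_llm(branch_b)}"
            ),
        }
        return shared_prefix + [merge_note]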
Alchemical is "you are the world's top expert on marketing, and if you get it right I'll tip you $100, and if you get it wrong a kitten will die".
The techniques in https://www.dbreunig.com/2025/06/26/how-to-fix-your-context.... seem a whole lot more rational to me than that.
If you look at how sophisticated current LLM systems work there is so much more to this.
Just one example: Microsoft open-sourced VS Code Copilot Chat today (MIT license). Their prompts are assembled dynamically, with instructions for each tool included only when that tool is enabled: https://github.com/microsoft/vscode-copilot-chat/blob/v0.29....
And the autocomplete stuff has a wealth of contextual information included: https://github.com/microsoft/vscode-copilot-chat/blob/v0.29....
You have access to the following information to help you make informed suggestions:

- recently_viewed_code_snippets: These are code snippets that the developer has recently looked at, which might provide context or examples relevant to the current task. They are listed from oldest to newest, with line numbers in the form #| to help you understand the edit diff history. It's possible these are entirely irrelevant to the developer's change.
- current_file_content: The content of the file the developer is currently working on, providing the broader context of the code. Line numbers in the form #| are included to help you understand the edit diff history.
- edit_diff_history: A record of changes made to the code, helping you understand the evolution of the code and the developer's intentions. These changes are listed from oldest to latest. It's possible a lot of old edit diff history is entirely irrelevant to the developer's change.
- area_around_code_to_edit: The context showing the code surrounding the section to be edited.
- cursor position marked as ${CURSOR_TAG}: Indicates where the developer's cursor is currently located, which can be crucial for understanding what part of the code they are focusing on.
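The general shape of that dynamic assembly is simple. Something like this sketch (not Copilot's actual code; the tool names are made up), where a tool's instructions only enter the context when the tool is enabled:

    TOOL_INSTRUCTIONS = {
        "search": "Use the search tool to look up symbols before guessing.",
        "terminal": "Run shell commands with the terminal tool; show the command first.",
        "edit_file": "Apply changes with the edit_file tool instead of printing whole files.",
    }

    def build_system_prompt(base: str, enabled_tools: set[str]) -> str:
        sections = [base]
        for name, instructions in TOOL_INSTRUCTIONS.items():
            if name in enabled_tools:   # disabled tools never enter the context
                sections.append(f"## {name}\n{instructions}")
        return "\n\n".join(sections)

    print(build_system_prompt("You are a coding assistant.", {"search", "edit_file"}))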
For example, while the specific prompts you're highlighting are unique to Copilot, I've implemented basically the same ideas on a project I've been working on, because the limitations of these models made it clear that, sooner rather than later, it was going to be necessary to pick and choose amongst tools.
LLM "engineering" is mostly at the same level of technical sophistication that web work was back when we were using CGI with Perl -- "hey guys, what if we make the webserver embed the app server in a subprocess?" "Genius!"
I don't mean that in a negative way, necessarily. It's just...seeing these "LLM thought leaders" talk about this stuff in thinkspeak is a bit like getting a Zed Shaw blogpost from 2007, but fluffed up like SICP.
I don't think that's true.
Even if it is true, there's a big difference between "thinking about the problem" and spending months (or even years) iteratively testing out different potential prompting patterns and figuring out which are most effective for a given application.
I was hoping "prompt engineering" would mean that.
OK, well...maybe I should spend my days writing long blogposts about the next ten things that I know I have to implement, then, and I'll be an AI thought-leader too. Certainly more lucrative than actually doing the work.
Because that's literally what's happening -- I find myself implementing (or having implemented) these trendy ideas. I don't think I'm doing anything special. It certainly isn't taking years, and I'm doing it without reading all of these long posts (mostly because it's kind of obvious).
Again, it very much reminds me of the early days of the web, except there's a lot more people who are just hype-beasting every little development. Linus is over there quietly resolving SMP deadlocks, and some influencer just wrote 10,000 words on how databases are faster if you use indexes.
The goal is to design a probability distribution that solves your task by taking a complicated base distribution and conditioning it, and the more care you put into thinking about it ("how do I condition for this?" / "when do I condition for that?"), the better the output you'll see.
(what seems to be meant by "context" is a sequence of these conditioning steps :) )
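Spelled out in my own notation (not anyone else's): the base model is a distribution over continuations, and every piece of context you add is one more conditioning term.

    % base model: a distribution over the output tokens x_1..x_T
    p_\theta(x) = \prod_t p_\theta(x_t \mid x_{<t})
    % conditioned on context pieces c_1..c_n (instructions, examples, retrieved docs, tool output):
    p_\theta(x \mid c_1, \ldots, c_n) = \prod_t p_\theta(x_t \mid c_1, \ldots, c_n, x_{<t})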
I mean yes, duh, relevant context matters. This is why so much effort was put into things like RAG, vector DBs, prompt synthesis, etc. over the years. LLMs still have pretty abysmal context windows, so being efficient with them matters.
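For anyone who hasn't seen it spelled out, the basic retrieve-then-stuff-the-prompt loop is tiny; a sketch, where embed() is a placeholder for an embedding-model call rather than any specific library:

    import math

    def embed(text: str) -> list[float]:
        # placeholder: in practice this is an embedding-model call
        return [text.lower().count(c) / max(len(text), 1) for c in "etaoinshrd"]

    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
        # rank documents by similarity to the query and keep the top k
        q = embed(query)
        return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

    def build_prompt(query: str, docs: list[str]) -> str:
        context = "\n\n".join(retrieve(query, docs))
        return "Answer using only the context below.\n\nContext:\n" + context + "\n\nQuestion: " + query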
LLM farts — Stochastic Wind Release.
The latest one is yet another attempt to make prompting sound like some kind of profound skill, when it’s really not that different from just knowing how to use search effectively.
Also, “context” is such an overloaded term at this point that you might as well just call it “doing stuff”, and that would objectively be more descriptive.