
Posted by horseradish 2 days ago

We should revisit literate programming in the agent era (silly.business)
288 points | 245 comments | page 5
pjmlp 1 day ago|
I'd rather go with formal specifications and proofs.
s3anw3 1 day ago||
I think the tension between natural language and code is fundamentally about information compression. Code is maximally compressed intent — minimal redundancy, precise semantics. Prose is deliberately less compressed — redundant, contextual, forgiving — because human cognition benefits from that slack.

Literate programming asks you to maintain both compression levels in parallel, which has always been the problem: it's real work to keep a compressed and an uncompressed representation in sync, with no compiler to enforce consistency between them.

What's interesting about your observation is that LLMs are essentially compression/decompression engines. They're great at expanding code into prose (explaining) and condensing prose into code (implementing). The "fundamental extra labor" you describe — translating between these two levels — is exactly what they're best at.

So I agree with your conclusion: the economics have changed. The cost of maintaining both representations just dropped to near zero. Whether that makes literate programming practical at scale is still an open question, but the bottleneck was always cost, not value.

senderista 1 day ago||
The "test runbook" approach that TFA describes sounds like doctest comments in Python or Rust.
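
For anyone who hasn't used the mechanism: a Python doctest embeds a runnable example directly in a docstring, and the doctest runner executes it, so the prose and the behavior can't silently drift apart. A minimal sketch:

```python
def filter_even(numbers):
    """Return only the even numbers, preserving order.

    The examples below are executed by the doctest runner:

    >>> filter_even([1, 2, 3, 4])
    [2, 4]
    >>> filter_even([])
    []
    """
    return [n for n in numbers if n % 2 == 0]
```

Running `python -m doctest module.py` (or `pytest --doctest-modules`) fails if an example's output no longer matches.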
whatgoodisaroad 1 day ago||
It could be fun to make a toy compiler that takes an arbitrary literate prompt as input and uses an LLM to output a machine-code executable (no intermediate structured language). You could call it llmllvm. Perhaps it would be tremendously dangerous.
rudhdb773b 1 day ago||
I'd love to see what Tim Daly could do with LLMs on Axiom's code base.
koolala 1 day ago||
Left to right APL style code seems like it could be words instead of symbols.
sublinear 2 days ago||
> This is especially important if the primary role of engineers is shifting from writing to reading.

This was always the primary role. The only people who ever said it was about writing just wanted an easy sales pitch aimed at everyone else.

Literate programming failed to take off because with that much prose it inevitably misrepresents the actual code. Most normal comments are bad enough.

It's hard to maintain any writing that doesn't actually change the result. You can't "test" comments. The author doesn't even need to know why the code works to write comments that are convincing at first glance. If we want to read lies influenced by office politics, we already have the rest of the docs.

8note 1 day ago||
> You can't "test" comments.

I'm thinking that we're approaching a world where you can both test for comments and test the comments themselves.

senderista 1 day ago||
Now that would be really interesting: prompt an LLM to find comments that misrepresent the code! I wonder how many false positives that would bring up?
ccosky 1 day ago||
I have a Claude Code skill for adding, deleting and improving comments. It does a decent job at detecting when comments are out of date with the code and updating them. It's not perfect, but it's something.
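
A rough sketch of how such a drift check might be wired up. The model call itself is left abstract (`ask_llm` is a hypothetical stand-in for whatever API you use; the prompt wording is invented for illustration) — only the comment/code pairing is shown concretely:

```python
def extract_comment_pairs(source: str):
    """Pair each '#' comment line with the code line that follows it."""
    lines = source.splitlines()
    pairs = []
    for i, line in enumerate(lines[:-1]):
        stripped = line.strip()
        if stripped.startswith("#"):
            pairs.append((stripped.lstrip("# "), lines[i + 1].strip()))
    return pairs

def build_drift_prompt(comment: str, code: str) -> str:
    """Assemble the question to send to a model (wording is illustrative)."""
    return (
        "Does this comment accurately describe this code? "
        f"Comment: {comment!r} Code: {code!r} "
        "Answer YES or NO with a one-line reason."
    )

source = """\
# returns the sum of the list
total = max(values)
"""
for comment, code in extract_comment_pairs(source):
    prompt = build_drift_prompt(comment, code)
    # reply = ask_llm(prompt)  # hypothetical model call goes here
```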
hrmtst93837 1 day ago|||
Literate programming failed because people treated long essays as the source of truth instead of testable artifacts.

Make prose runnable and minimal: focus the narrative on intent and invariants, embed tiny examples as doctests or runnable notebooks, and enforce them in CI so documentation regressions break the build. Gate agent-edited changes behind execution and property tests (e.g. Hypothesis) plus a CI job that runs pytest --doctest-modules and executes notebooks, because agents produce confident-sounding text that will quietly break your API if trusted blindly.
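
The enforcement half of this can be reproduced locally; a minimal sketch of the same check `pytest --doctest-modules` would run in CI (the function here is just a placeholder):

```python
import doctest

def clamp(x, lo, hi):
    """Clamp x into the closed interval [lo, hi].

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(99, 0, 10)
    10
    """
    return max(lo, min(hi, x))

# Run every doctest in this module; a non-zero failure count is
# exactly what would break the build in CI.
failures = doctest.testmod().failed
</imports>```

Editing the body without updating the examples (or vice versa) turns the documentation regression into a red build.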

ares623 1 day ago|||
I don't buy that. Writing is taking a bad rap from all this. Writing _is_ a form of more intense reading. Reading on steroids, as they say. If reading is considered good, writing should be considered better.
bigyabai 1 day ago||
Writing in that draft style is really only useful because a) you read the results and b) you write an improved version at the end. Drafting forever is not considered "better" because someone (usually you) has to sift through the crap to find the good parts.

This is especially pronounced in the programming workplace, where the most "senior" programmers are asked to stop programming so they can review PRs.

c0rp4s 2 days ago||
You're right that you can't test comments, but you can test the code they describe. That's what reproducibility bundles do in scientific computing: the prose says "we filtered variants with MAF < 0.01", and the bundle includes the exact shell command, environment, and checksums so anyone can verify the prose matches reality. The prose becomes a testable claim rather than a decorative comment. That said, I agree the failure mode of literate programming is prose that drifts from code. The question is whether agents reduce that drift enough to change the calculus.
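
A toy version of such a bundle check, assuming a manifest that maps file names to SHA-256 digests (the manifest format and file names here are invented for illustration):

```python
import hashlib
import json
import pathlib
import tempfile

def sha256_of(path: pathlib.Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_bundle(manifest_path: pathlib.Path) -> bool:
    """True iff every file in the manifest matches its recorded digest."""
    manifest = json.loads(manifest_path.read_text())
    root = manifest_path.parent
    return all(sha256_of(root / name) == digest
               for name, digest in manifest.items())

# Build a tiny bundle in a temp dir and verify it.
with tempfile.TemporaryDirectory() as d:
    root = pathlib.Path(d)
    data = root / "variants.tsv"
    data.write_text("chr1\t12345\tA\tG\n")
    manifest = root / "MANIFEST.json"
    manifest.write_text(json.dumps({"variants.tsv": sha256_of(data)}))
    assert verify_bundle(manifest)  # the prose claim is now mechanically checkable
```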
hsaliak 1 day ago||
I explored this in std::slop (my clanker): https://github.com/hsaliak/std_slop. One of its differentiating features is that it has only a single tool call, run_js. The LLM produces JS scripts to do its work. Naturally, I tried to teach it to add comments to these scripts and incorporate literate programming elements. This was interesting because every tool call now 'hydrated' some free-form thinking, but it comes at output token cost.

Output tokens are expensive! In GPT-5.4 it's ~180 dollars per million tokens! I've settled for brief descriptions that communicate the 'why' as a result. The code is the documentation, after all.

amelius 1 day ago||
We need an append-only programming language.
anotheryou 1 day ago|
but doesn't "the code is documentation" work better for machines?

and don't we have doc-blocks?

zdragnar 1 day ago|
Code doesn't express intent, only the implementation. Docblocks are fine for specifying local behavior, but are terrible for big picture things.
palata 1 day ago|||
Well many times it does.

fun isEven(number: Int): Boolean { return number % 2 == 0 }

I would say this expresses the intent, no need for a comment saying "check if the number is even".

Most of the code I read (at work) is not documented, yet I still understand the intent. In open source projects, I used to read the source code because the documentation was nonexistent or out of date. To the point where now I go directly to the source code, because if the code is well written, I can actually understand it.

zdragnar 1 day ago||
In your example, the implementation matches the intention. That is not the same thing.

fun isWeekday(number: Int): Boolean { return number % 2 == 0 }

With this small change, all we have are questions:

Is the name wrong, or the behavior? Is this a copy / paste error? Where is the specification that tells me which is right, the name or the body? Where are the tests located that should verify the expected behavior?

Did the implementation initially match the intent, but some business rule changed that necessitated a change to the implementation, and the maintainer didn't bother to update the name?

Both of our examples are rather trite; I agree that I wouldn't bother documenting the local behavior of an "isEven" function. I probably would want a bit of documentation at the call site stating why the evenness of a given number is useful to know. Generally speaking, this is why I tend to dislike docblock-style comments and prefer bigger-picture documentation instead: it better captures intent.
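
The unanswerable questions above are exactly what an executable specification pins down. A minimal Python sketch: the buggy modulo body mirrors the isWeekday example, and the day numbering (Monday = 0) is an assumption made for illustration:

```python
def is_weekday(day: int) -> bool:
    """Intended: True for Monday(0)..Friday(4), False for Saturday(5)/Sunday(6)."""
    return day % 2 == 0  # buggy body, copied from the example above

# A test table encodes the *intent* independently of the implementation:
intended = {0: True, 1: True, 2: True, 3: True, 4: True, 5: False, 6: False}
mismatches = [d for d, want in intended.items() if is_weekday(d) != want]
# A non-empty mismatch list is the drift between name and body made visible.
```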

palata 1 day ago||
I would call your example "bad code". Do you disagree with that?
zdragnar 21 hours ago||
Not at all. I'm just pointing out that code does not intrinsically convey intent, only implementation.

To use a less trite example, I'd probably find some case where a word or name can have different meanings in different contexts, and how that can be confusing rather than clarifying without further documentation or knowledge of the problem space.

Really though, any bug in the code you write is a deviation between intent and implementation. That's why documentation can be a useful supplement to code. If you haven't, take a look at the Underhanded C Contest; there are some fantastically good old gems in there that demonstrate how a plain reading of the code may not convey intent correctly.

The winner of this contest might be a good example: https://www.underhanded-c.org/_page_id_26.html

palata 8 hours ago||
I feel like we're going from "literate programming" to "sometimes it makes sense to add comments". I agree with the latter. Good code is mostly unsurprising, and when it is surprising it deserves a comment. But that is more the exception than the rule.

Literate programming makes it the rule.

anotheryou 1 day ago|||
right you are :)

does literate code have a place for big pic though?
