Posted by jesseduffield 12/26/2025
That's completely false: images were used for storytelling thousands of years before text. Compare, for instance, the Lascaux paintings (more than 17 000 years old), the Göbeklitepe sculptures and stone drawings (more than 12 000 years old), or the more than 15 000 paintings of the City of Sefar (Algeria), which some estimate to date back as far as 20 000 years, with the earliest text known in human history, the Kish Tablet from Mesopotamia, usually dated to around 3500 BC.
The image is of a monochrome logo with anti-aliased edges. Due to being a simple filled geometric shape, it could compress well with RLE, ZIP compression, or even predictors. It could even be represented as vector drawing commands (LineTo, CurveTo, etc...).
In a 1-bit-per-pixel format, a 20x20 image ends up as 400 bits (50 bytes).
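To make the arithmetic concrete, here is a minimal sketch that packs a 20x20 image at 1 bit per pixel and run-length encodes it. The image is a made-up filled square standing in for the logo, and going to 1 bpp assumes you are willing to drop the anti-aliasing; `pack_1bpp` and `rle` are illustrative names, not from any library.

```python
def pack_1bpp(pixels):
    """Pack rows of 0/1 pixels into bytes, 8 pixels per byte, MSB first."""
    bits = [p for row in pixels for p in row]
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        chunk += [0] * (8 - len(chunk))  # zero-pad the final byte if needed
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def rle(bits):
    """Run-length encode a flat bit sequence as (value, run_length) pairs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1
        else:
            runs.append([b, 1])
    return [tuple(r) for r in runs]


# Hypothetical stand-in for the logo: a filled 10x10 square on a 20x20 canvas.
SIZE = 20
image = [[1 if 5 <= x < 15 and 5 <= y < 15 else 0 for x in range(SIZE)]
         for y in range(SIZE)]

packed = pack_1bpp(image)
runs = rle([p for row in image for p in row])

print(len(packed), "bytes packed")   # 400 bits / 8
print(len(runs), "runs after RLE")
```

For a shape this simple the run list is much shorter than the pixel list, which is the reason RLE does well on filled geometric shapes.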
And as for the original article, there are no "text [systems]" (or there are, in the same way there are "number [systems]": just made up). "Text" like this very thing you are reading is a 2D drawing. No character glyphs of any kind (Latin, logograms, etc.) are defined by the universe*; they are human inventions, stored and interpreted at the human collective level. Computers don't know anything about text, only "numbers" of some bit width, and on top of those numbers a system must be created that maps a number representation to a drawing by some method (e.g. a bitmap). There is also a lot of difference between formal/executable languages and natural human languages. Anyway, it's not about some text format or encoding; it's about the human- or computer-defined and interpreted non-linguistic meaning behind it (Wittgenstein).
* DNA/RNA can be one such "universal character glyph/string", as the "textual" information is physically constructed and interpreted.
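A toy version of that number-to-drawing mapping, as a sketch only: the 3x5 `FONT` table below is invented for illustration (it is not any real font format), and it covers just two code points.

```python
# Hypothetical 3x5 bitmap font: each code point maps to five rows of three bits.
FONT = {
    72: (0b101, 0b101, 0b111, 0b101, 0b101),  # the number we agree to draw as "H"
    73: (0b111, 0b010, 0b010, 0b010, 0b111),  # the number we agree to draw as "I"
}


def render(code_points):
    """Turn a list of numbers into a small ASCII-art drawing, row by row."""
    lines = []
    for row in range(5):
        cells = []
        for cp in code_points:
            bits = FONT[cp][row]
            # Test the three bits left to right; set bits become ink.
            cells.append("".join("#" if bits & (1 << (2 - col)) else "."
                                 for col in range(3)))
        lines.append(" ".join(cells))
    return "\n".join(lines)


numbers = [72, 73]  # all the computer actually stores
art = render(numbers)
print(art)
```

Nothing in the numbers 72 and 73 says "H" or "I"; that meaning lives entirely in the table and in the reader.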
But I can't help but feel that we try to jam everything into that format because that's what's already ubiquitous. Reminds me of how every hobby OS is a copy of some Unix/Posix system.
If we had a more general structured format would we say the opposite?
- I want to learn how to climb rock walls
- I want to learn how to throw a baseball
- I want to learn how to do public speaking
- I want to learn how to play piano
- I want to make a fire in the woods
- I want to understand the emotional impact of war
- I want to be involved in my child's life
In text format no less
The 1% where something else is better?
YouTube videos that show you how to access hidden fasteners on things you want to take apart.
Not that I can't get absolutely anything open, but sometimes it's nice to be able to do so with minimal damage.
To the extent that that could work, I would imagine that I, personally, would be happy reading the textual description instead of watching the video, and for me, we'd now be even closer to text wins 100% of the time.
In other words, it's not that you _can't_ give excellent descriptions that would obviate the need for video, it's just that people _don't_, even, or perhaps even especially, when they think they do.
If someone writes text that creates a video that shows exactly how to get something apart, then _presumably_ they also watch the video to make sure it works.
So the video becomes a debugging tool for their instructions. Perhaps not as good as watching 100 people do it, but maybe even better in some ways.
So the video codec you describe could be a useful tool to help create more programmers.
https://www.commitstrip.com/en/2016/08/25/a-very-comprehensi...
You may be right, although, of course, current LLMs often do the right thing with "about 3/5ths of the way."
OTOH, as someone who has done CAD and schematic drawings by programming, I am not 100% convinced about the inevitability of unreadability.
In any case, though, the bar is not really whether any human can interpret the text, but whether the average human will interpret the text or video faster, and here, to your point, yes, the video probably still wins handily.
The closest analogy I can think of is animated math gifs like these:
https://en.wikipedia.org/wiki/User:LucasVB/Gallery
Which can be a huge aid in learning.
But this leads to another conundrum. Where do animated GIFs end and video begin? Because I could see a simple line-drawing style animated GIF being sufficient for most purposes.
Text seems worse to me. First of all, binary encodings are a superset of text encodings. But less abstractly, binary enables content-transparent compression and error correction.
Like other commenters have pointed out, the downside of binary is needing sufficient tooling. Depending on the domain, that can indeed be a downside. But if that critique isn’t relevant for a given context, it’s extremely unlikely that plaintext (ASCII?) is superior.
Text seems more like the answer to a plea for lowest common denominator of tooling.
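A rough illustration of the "content-transparent" point, using Python's standard `zlib` module: the compressor and the checksum only ever see bytes and do not care whether those bytes happen to be text. Note the hedge: CRC32 only detects corruption, it does not correct it; real error correction would need something like a Reed-Solomon code, which is not shown here. The payload is made up.

```python
import zlib

# Any bytes will do; this payload just happens to be UTF-8 text.
payload = ("text wins " * 100).encode("utf-8")

compressed = zlib.compress(payload)   # no knowledge of the content needed
checksum = zlib.crc32(payload)        # integrity check over the raw bytes

restored = zlib.decompress(compressed)

# Flip a single bit to simulate corruption in storage or transit.
corrupted = bytearray(payload)
corrupted[0] ^= 0x01

print(len(payload), "->", len(compressed), "bytes")
print("round trip ok:", restored == payload)
print("corruption detected:", zlib.crc32(bytes(corrupted)) != checksum)
```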
The information-theoretic justification is that binary's efficiency assumes a perfectly known codec, but the entropy of time destroys codecs (bit rot/obsolescence). Text sacrifices transmission efficiency for semantic recovery - it remains decodable even when the specific tooling is lost, making it the most robust encoding for long-term information survival.
> the entropy of time destroys codecs (bit rot/obsolescence)
Okay, so you don't mean "entropy" in an information theoretic sense. You're just talking about the decay of time. That's a much more specific claim than your original one, and I grant that that may be true for some use-cases. But you don't need semantic recovery if you don't need to do recovery at all, i.e. if your data format and/or storage medium transparently provide redundancy and/or versioning.
This may be true if you mean text written on a physical medium (especially if it's engraved in stone or clay), but it's not true at all if you mean text stored in a computer medium. Text is just binary with a dedicated codec. Good luck interpreting Chinese plain text files after humanity has forgotten about Unicode and UTF-8.
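A small sketch of the "text is just binary with a dedicated codec" point. The sample string is arbitrary, and Latin-1 stands in for "some codec a future reader might guess"; that choice is an assumption for illustration only.

```python
text = "纯文本"                    # three Chinese characters
raw = text.encode("utf-8")        # what is actually stored: nine bytes

print(raw.hex(" "))               # the file as a reader without the codec sees it

right = raw.decode("utf-8")       # correct codec: the text comes back
wrong = raw.decode("latin-1")     # wrong guess: decodes without error, but to mojibake

print(right == text)
print(len(wrong), "characters of noise")
```

The wrong decode does not even fail; it quietly produces nine unrelated characters, so nothing in the bytes themselves tells you which codec was meant.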
While text-based representations may be easier to decipher than random binary data even without knowing the encoding (as in an archeological setting), it's hardly going to be the easiest. Bitmaps, for example, have a much more limited set of symbols than Unicode, so I'd bet it would be much easier to display a long lost .bmp file than a random .txt file even a few hundred years from now. Same goes for raw audio, too. Now, JPEG and MP3 might be much more difficult, because the encoding is doing much more work.
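To give a feel for why raw pixels might be recoverable, here is a sketch that guesses the row width of a headerless pixel stream by finding the width at which vertically adjacent pixels differ least. Assumptions, stated loudly: this is an 8-bit grayscale stream with no header and no row padding (a real .bmp file has both), the test image is synthetic, and `guess_width` is a made-up helper, not an established algorithm from any library.

```python
def guess_width(pixels, candidates):
    """Pick the candidate row width whose vertical neighbours differ least."""
    def cost(width):
        pairs = len(pixels) - width
        total = sum(abs(pixels[i] - pixels[i + width]) for i in range(pairs))
        return total / pairs  # normalise so widths are comparable
    return min(candidates, key=cost)


# Synthetic 24x16 image: a filled white disc on a black background.
W, H = 24, 16
pixels = [255 if (x - 12) ** 2 + (y - 8) ** 2 < 36 else 0
          for y in range(H) for x in range(W)]

# Pretend the width is unknown and try every plausible value.
guess = guess_width(pixels, range(8, 65))
print("guessed width:", guess)

# With the width known, the stream can be drawn directly.
for y in range(H):
    print("".join("#" if p else "." for p in pixels[y * guess:(y + 1) * guess]))
```

This leans on the fact that natural images are smooth from one row to the next, which is a property of the content rather than of any codec. It would not carry over to JPEG, where the bytes are not pixels.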