Posted by signa11 6/26/2025
When I got a bit older I realized people were doing this with performance as well. We just call this part architecture, and that part Best Practices.
And a compiler will never make sentinels for you. Most of these tricks still work and make a measurable difference.
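For the curious, here is a minimal sketch of the sentinel trick being referred to (the same one Knuth's searching example uses); the method name and layout are just illustrative. Planting the key in a spare slot at the end of the array lets the loop run with a single test per iteration instead of a bounds check plus an equality test:

```java
// Sentinel linear search. Precondition (an assumption of this sketch):
// the array has capacity n + 1, so slot n is free for the sentinel.
static int sentinelSearch(int[] a, int n, int key) {
    int saved = a[n];      // remember whatever was in the spare slot
    a[n] = key;            // plant the sentinel: the loop is guaranteed to stop
    int i = 0;
    while (a[i] != key) {  // single comparison per iteration, no i < n test
        i++;
    }
    a[n] = saved;          // restore the spare slot
    return i < n ? i : -1; // stopping on the sentinel means "not found"
}
```

Whether this beats the two-test loop today depends on the JIT and the hardware, which is exactly why it is worth measuring rather than assuming either way.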
I have made the same point several times, online and in person: the famous quote is misunderstood, and I often suggest people take the time to go back to the source and read it, since it's a wonderful read.
I'm speaking from experience in a large code base where people liked to use the "root of all evil" argument quite a bit at one time.
That's why, when converting data from one list to another, I used to always initialize the destination with the known size (now I mostly use Java streams).
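A small sketch of both habits mentioned here (method names are illustrative): presizing the destination list avoids the repeated internal array growth a default-sized ArrayList would do, while the stream version delegates sizing to the collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class ListCopy {
    // Destination initialized with the known final size up front,
    // so the backing array is never reallocated during the copy.
    static List<String> copyPresized(List<Integer> src) {
        List<String> out = new ArrayList<>(src.size());
        for (Integer x : src) {
            out.add(String.valueOf(x));
        }
        return out;
    }

    // The stream equivalent; the collector manages sizing internally.
    static List<String> copyWithStreams(List<Integer> src) {
        return src.stream().map(String::valueOf).collect(Collectors.toList());
    }
}
```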
On the other hand, there is me doing JMH microbenchmarking on code that, it turned out, was parsing configuration files (only once, at startup). That was the moment I learned the said quote from Donald Knuth, from the colleagues reviewing my code.
The bit about sequential search and optimization, the topic of this whole blog post, is kind of a minor detail in the paper, despite being so eloquently phrased that it's the part everyone quotes, sometimes without even knowing what "optimization" is. The paper's second page has a 33-line table of contents, in which "A Searching Example" and "Efficiency" are lines 4 and 5. They occupy pages 266–269, 3 pages of a 41-page paper. (But efficiency is an issue he considers throughout.)
Mostly the paper is about control structures, and in particular how we can structure our (imperative!) programs to permit formal proofs of correctness. C either didn't exist yet or was known to fewer than five people, so its break/continue control structures were not yet the default. Knuth talks about a number of other options that didn't end up being popular.
It was a really interesting reminder of how different things looked 51 years ago. Profilers had just been invented (by Dan Ingalls, apparently?) and were still not widely available. Compilers usually didn't do register allocation. Dynamically typed languages and functional programming existed, in the form of Lisp and APL, but were far outside the mainstream because they were so inefficient. You could reliably estimate a program's speed by counting machine instructions. People were sincerely advocating using loops that didn't allow "break", and subroutines without early "return", in the interest of building up their control flow algebraically. Knuth considered a recursive search solution to the N-queens problem to be interesting enough to mention it in CACM; similarly, he explains the tail-recursion optimization as if it's not novel but at least a bit recondite, requiring careful explanation.
He mentions COBOL, BCPL, BLISS, Algol W, Algol 60, Algol 68, other Algols, PL/I (in fact including some example code), Fortran, macro assemblers, "structured assemblers", "Wirth's Pascal language [97]", Lisp, a PDP-10 Algol compiler called SAIL (?), META-II, MIXAL, PL360, and something called XPL, but not Smalltalk, CLU, APL, FORTH, or BASIC.
He points out that it would be great for languages to bundle together the set of subroutines for operating on a particular data type, as Smalltalk and CLU did; he doesn't mention CLU, which had been introduced only the previous year, but he's clearly thinking along those lines (p. 295):
> (...) it turns out that a given level of abstraction often involves several related routines and data definitions; for example, when we decide to represent a table in a certain way, we also want to specify the routines for storing and fetching data from that table. The next generation of languages will probably take into account such related routines.
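What Knuth anticipates here is, roughly, what later languages express as a class or module. A hypothetical sketch in Java of his table example, with the names invented for illustration: the representation and the store/fetch routines live in one unit, and callers never touch the representation directly.

```java
import java.util.HashMap;
import java.util.Map;

// The bundling Knuth describes: a table's representation plus the
// routines for storing and fetching data from it, grouped as one unit.
final class SymbolTable {
    private final Map<String, Integer> entries = new HashMap<>(); // hidden representation

    void store(String key, int value) {
        entries.put(key, value);
    }

    Integer fetch(String key) {
        return entries.get(key); // null when the key is absent
    }
}
```

Swapping the HashMap for, say, a sorted array would change no caller, which is precisely the point of deciding "to represent a table in a certain way" behind a fixed set of routines.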
Often when I read old papers, or old software documentation like the TENEX EXEC manual, it's obvious why the paths not taken were not taken; what we ended up doing is just obviously better. This paper is not like that. Most of the alternatives mentioned seem like they might have turned out just as well as what we ended up with.
You may also want to look at the version published in Knuth's collection called Literate Programming (with corresponding errata: https://cs.stanford.edu/~knuth/lp.html#:~:text=Errata,correc... ), and I've just uploaded a scan here: https://shreevatsa.net/tmp/2025-06/DEK-P67-Structured.progra...
I don't think the measurements support the conclusion that well.
What I want to have when I see those measurements:
I want the language's abstract machine and the compiler not to get in the way of how I want code to perform on certain platforms. This is currently not what I get at all. The language actively works against me. For example, I cannot read a cache line from an address, because my object may not span that far. The compiler has its own direction at best. This means there is no way to specify things, and all I can do is test the compiler output after each minor update. In a multi-year project such updates can happen dozens of times! The ergonomics of trying to specify things are actually getting worse. The example with assembly matches my other experiences: the compiler ignores even intrinsics now. If it wants to optimize, it does.
I can't run to someone else for magic library solutions every time I need to write code. I need to be able to get things done in a reasonable amount of time. It is my organization's development process that should decide whether the solution I used belongs in some library or not. In practice this means that efforts covering only some platforms and only some libraries are not as universally applicable to my platforms and my libraries as folks at language conferences think /s.
Disclaimer. I work in gamedev.