Posted by vismit2000 10 hours ago
Later that week, now that things were working, I profiled the n^2 search. The software controlled a piece of industrial test equipment, and the actual test process would take something around 4 hours to complete. Using the very worst case, far-beyond-reasonable data set, if I left the n^2 behavior in, would have added something like 6 seconds to that 4 hour runtime.
(Ultimately I fixed it anyways, but because it was easy, not because it mattered.)
"There's a third thing [beyond speed and memory] that you might want to optimize for which is much more important than either of these, which is years of your life required per program implementation." This is of course from the perspective of a solo indie game developer, but it's a good and interesting perspective to consider.
That also means that "just do an array of flat records" is a very sane default even if it seems brutish at first.
Even storing something simple such as array of complex numbers as an structure of arrays (SoA) rather than an array of structures (AoS) can unlock a lot of optimizations. For example, less permutes/shuffles and more arithmetic instructions.
Depending on how many fields you actually need when you iterate over the data, you prevent cache pollution as well.
It's a good consideration tbh.
This is me with hash maps.
* Braid was 3 years
* Cave Story was 5 years
* World of Goo was 2 years
* Limbo was about 3 years (but with 8-16 people)
So Braid seems pretty average.
In practice what I see fail most often is not premature optimization but premature abstraction. People build elaborate indirection layers for flexibility they never need, and those layers impose real costs on every future reader of the code. The irony is that abstraction is supposed to manage complexity but prematurely applied it just creates a different kind.
This matches my experience as well.
Someone here commented once that abstractions should be emergent, not speculative, and I loved that line so much I use it with my team all the time now when I see the craziness starting.
I truly believe this comes from devs who want to feel smart by "architecting" solutions to future problems before those problems have become well defined.
I'm sure it's super flexible but the exactly same thing could have been achieved with 8 YAML files and 60% of the content between them would be identical.
Compare and contrast https://people.mpi-sws.org/~dreyer/tor/papers/wadler.pdf
If you've been monitoring properly, you buy yourself time before it becomes a problem as such, but in my experience most developers who don't anticipate load scaling also don't monitor properly.
I've seen a "senior software engineer with 20 years of industry experience" put code into production that ended up needing 30 minute timeouts for a HTTP response only 2 years after initial deployment. That is not a typo, 30 minutes. I had to take over and rewrite their "simple" code to stop the VP-level escalations our org received because of this engineering philosophy.
There is nothing to suggest you should wait to optimize under pressure, only that you should optimize only after you have measured. Benchmark tests are still best written during the development cycle, not while running hot in production.
Starting with the naive solution helps quickly ensure that your API is sensible and that your testing/benchmarking is in good shape before you start poking at the hard bits where you are much more likely to screw things up, all while offering a baseline score to prove that your optimizations are actually necessary and an improvement.
I found that over time my senses have been honed to more quickly identify things that are important to deeply study and plan right now and areas where I can skimp more and fix it later if problems develop. I don't know if there was a short cut to honing those senses that didn't involve a lot of pain as I needed to pick apart and rework oversights.
This is something I tend to consider far, far worse than "AI Slop" in practice. I always hated Microsoft Enterprise Library's Data Access Application Block (DAAB) in practice. I've literally only ever seen one product that supported multiple database backends that necessitated that level of abstraction... but I've seen that library well over a dozen times in practice. Just as a specific example.
IMO, abstractions should generally serve to make the rest of the codebase reasonable more often than not... abstractions that hide complexity are useful... abstractions that add complexity much less so.
For most of my career, sticking to rule 3 made the most sense. When the CS major would be annoying and talk about big-O they usually forgot n was tiny. But then my job changed. I started working on different things. Suddenly my job started sounding more like a leetcode interview people complain about. Now n really is big and now it really does matter.
Keep in mind that Rob Pike comes from a different era when programming for 'big iron' looked a lot more like programming for an embedded microcontroller now.
Of course there is a balance to this, the engineering time to implement both options is an important consideration. But given both algorithms are relatively easy to implement I will default to the one that is faster at large sizes even if it is slower at common sizes. I do suspect that there is an implicit assumption that "fancy" algorithms take longer and are harder to implement. But in many cases both algorithms are in the standard library and just need to be selected. If this post focused on "fancy" in terms of actual time to implement rather than speed for common sizes I would be more inclined to agree with it.
I wrote an article about this a while back: https://kevincox.ca/2023/05/09/less-than-quadratic/
Most people don't need FFT algorithm for multiplying large numbers, Karatsuba's algorithm is fine. But in some domains the difference does matter.
Personally I usually see the opposite effect - people first reach for a too-naive approach and implement some O(n^2) algorithm where it wouldn't have even been more complex to implement something O(n) or O(n log n). And n is almost always small so it works fine, until it blows up spectacularly.
To address our favorite topic: while I use LLMs to assist on coding tasks a lot, I think they're very weak at this. Claude is much more likely to suggest or expand complex control flow logic on small data types than it is to recognize and implement an opportunity to encapsulate ideas in composable chunks. And I don't buy the idea that this doesn't matter since most code will be produced and consumed by LLMs. The LLMs of today are much more effective on code bases that have already been thoughtfully designed. So are humans. Why would that change?
Having implemented my shared of highly complex high-performance algorithms in the past, the key was always to figure out how to massage the raw data into structures that allow the algorithm to fly. It requires both a decent knowledge of the various algorithm options you have, as well as being flexible to see that the data could be presented a different way to get to the same result orders of magnitude faster.
"Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious." -- Fred Brooks, The Mythical Man Month (1975)
I think rule 5 is often ignored by a lot of distributed services. Where you have to make several calls, each with their own http, db and "security" overhead, when one would do. Then these each end up with caching layers because they are "slow" (in aggregate).
Very few software services built today are doing it right. Most assume they need to scale from day one, pick a technology stack to enable that, and then alter the product to reflect the limitations of the tech stack they picked. Then they wonder why they need to spend millions on sales and marketing to convince people to use the product they've built, and millions on AWS bills to scale it. But then, the core problem was really that their company did not need to exist in the first place and only does because investors insist on cargo-culting the latest hot thing.
This is why software sucks so much today.
I'll add one more modification if you're like me (and apparently many others): go too far with your distribution and pull it back to a sane (i.e. small handful) number of distributed services, hopefully before you get too far down the implementation...
I've always been a KISS/DRY person but over a decade there are plenty of moments where you're tempted to reach for a fancier database or rewrite something in a trendier stack. What's actually kept things running well at scale is boring, known technologies and only optimizing in the places where it actually matters.
We wrote our principles down recently and it basically just reads like Pike's rules in different words: https://www.geocod.io/code-and-coordinates/2025-09-30-develo...
For example, I've often heard "premature optimization is the root of all evil" invoked to support opposite sides of the same argument. Pike's rules are much clearer and harder to interpret creatively.
Also, it's amusing that you don't hear this anymore:
> Rule 5 is often shortened to "write stupid code that uses smart objects".
In context, this clearly means that if you invest enough mental work in designing your data structures, it's easy to write simple code to solve your problem. But interpreted through an OO mindset, this could be seen as encouraging one of the classic noob mistakes of the heyday of OO: believing that your code could be as complex as you wanted, without cost, as long as you hid the complicated bits inside member methods on your objects. I'm guessing that "write stupid code that uses smart objects" was a snappy bit of wisdom in the pre-OO days and was discarded as dangerous when the context of OO created a new and harmful way of interpreting it.
keeping the historical chain of thinking alive is good, actually
Dean is saying (implicitly) that you can estimate performance, and therefore you can design for speed a priori - without measuring, and, indeed, before there is anything to measure.
I suspect that both authors would agree that there's a happy medium: you absolutely can and should use your knowledge to design for speed, but given an implementation of a reasonable design, you need measurement to "tune" or improve incrementally.
But e.g. if you want to do fast math, you really need to design your pipeline around cache efficiency from the beginning – it's very hard to retrofit. Whereas reducing memory allocations in order to make parallel algorithms faster is something you can usually do after profiling.
It's so true, when specing things I always try to focused on DDL because even the UI will fall into place as well, and a place I see claude opus fail as well when building things.
"Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious." -- Fred Brooks, The Mythical Man Month (1975)
The thing is, if you build enough of the same kinds of systems in the same kinds of domains, you can kinda tell where you should optimize ahead of time.
Most of us tend to build the same kinds of systems and usually spend a career or a good chunk of our careers in a given domain. I feel like you can't really be considered a staff/principal if you can't already tell ahead of time where the perf bottleneck will be just on experience and intuition.
I have several times done performance tests before starting a project to confirm it can be made fast enough to be viable, the entire approach can often shift depending on how quickly something can be done.
It's the same thing with programmers who believe in BDUF or disbelieve YAGNI - they design architectures for anticipated futures which do not materialize instead of evolving the architecture retrospectively in line with the future which did materialize.
I think it's a natural human foible. Gambling, for instance, probably wouldnt exist if humans' gut instincts about their ability to predict future defaulted to realistic.
This is why no matter how many brilliant programmers scream YAGNI, dont do BDUF and dont prematurely optimize there will always be some comment saying the equivalent of "akshually sometimes you should...", remembering that one time when they metaphorically rolled a double six and anticipated the necessary architecture correctly when it wasnt even necessary to do so.
These programmers are all hopped up on a different kind of roulette these days...
Don't insist on file-based data ingestion being a wrapper around a json-rpc api just because most similar things are moving that direction; what matters is whether someone has specifically asked for that for this particular system yet.
.
Not all decisions can be usefully revisited later. Sometimes you really do need to go "what if..." and make sure none of the possibilities will bite too hard. Leaving the pizza cave occasionally and making sure you (have contacts who) have some idea about the direction of the industry you're writing stuff for can help.
> Sure, don't build your system to keep audit trails until after you have questions to answer so that you know what needs to go in those audit trails...what matters is whether someone has specifically asked for that for this particular system yet.
I spent ~15 years in life sciences.You're going to build an audit trail, no matter what. There's no validated system in LS that does not have an audit trail.
It's just like e-commerce; you're going to have a cart and a checkout page. There's no point in calling that a premature optimization. Every e-commerce website has more or less the same set of flows with simply different configuration/parameters/providers.
Audit trails are commonly neglected coz somebody didnt ask the right questions, not coz somebody didnt try to anticipate the future.
Rules are "kinda" made to be broken. Be free.
I've been sticking to these rules (and will keep sticking to them) for as long as I can program (I've been doing it for the last 30 years).
IMHO, you can feel that a bottleneck is likely to occur, but you definitely can't tell where, when, or how it will actually happen.
Notice my use of the word "Novelty".
I get hired because I'm very good at building specific kinds of systems so I tend to build many variants of the same kinds of systems. They are generally not that different and the ways in which the applications perform are similar.
I do not generally write new algorithms, operating systems, nor programming languages.
I don't think this is so hard to understand the nuance of Pike's advice and what we "mortals" do in or day-to-day to earn a living.
If you'd said Plan 9 and UTF-8 I'd agree with you.
Unless you meant to imply that UNIX isn't cool.
A lot of people are learning some history today, beautiful to see.
Unix was created by Ken Thompson and Dennis Ritchie at Bell Labs (AT&T) in 1969. Thompson wrote the initial version, and Ritchie later contributed significantly, including developing the C programming language, which Unix was subsequently rewritten in.
contribute < wrote.
His credits are huge, but I think saying he wrote Unix is misattribution.
Credits include: Plan 9 (successor to Unix), Unix Window System, UTF-8 (maybe his most universally impactful contribution), Unix Philosophy Articulation, strings/greps/other tools, regular expressions, C successor work that ultimately let him to Go.