i.e. optimization had violated a rule we were implicitly relying on (that each non-inlined function should start at a distinct address, so that address-to-symbol mapping could be done easily). But that’s not an explicit guarantee, and optimizers don’t seem to think about it much. (For inlining it seems to have had some thought; the result still sucks, but anyway this case doesn’t fit the pattern of inlining.)
I find it hard to say anyone is dead wrong in this case… but I would turn off that LTCG optimization any time I could, except where proven necessary.
This layering makes the order of the passes important, and the result very sensitive to it. The passes usually don't have a grand plan; they just keep shuffling code around in different ways. A pass may only be applicable to code in a specific form created by a previous simplification pass. One pass may undo optimizations of a previous pass, or optimize out a detail required by a later pass.
Separation into passes makes it easier to reason about correctness of each transformation in isolation, but the combined result is kinda slow and complicated.
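To make the ordering point concrete, here is a minimal sketch on a made-up three-opcode IR (`const`, `add`, `ret`), assuming straight-line code where every name is assigned once. The IR and both pass functions are invented for illustration and are not how any real compiler structures its passes. Running the same two passes in the opposite order leaves dead instructions behind:

```python
# Toy IR (hypothetical): ("const", dst, value), ("add", dst, a, b), ("ret", src).

def fold_constants(prog):
    """Replace an add of two known constants with a single const."""
    known = {}
    out = []
    for ins in prog:
        op = ins[0]
        if op == "const":
            known[ins[1]] = ins[2]
            out.append(ins)
        elif op == "add" and ins[2] in known and ins[3] in known:
            value = known[ins[2]] + known[ins[3]]
            known[ins[1]] = value
            out.append(("const", ins[1], value))
        else:
            out.append(ins)
    return out

def eliminate_dead_code(prog):
    """Drop instructions whose result is never read (backward liveness scan)."""
    live = set()
    out = []
    for ins in reversed(prog):
        op = ins[0]
        if op == "ret":
            live.add(ins[1])
            out.append(ins)
        elif ins[1] in live:
            live.discard(ins[1])
            if op == "add":
                live.update(ins[2:])  # operands of a live add are live too
            out.append(ins)
        # otherwise the result is unused and the instruction is dropped
    out.reverse()
    return out

program = [
    ("const", "a", 2),
    ("const", "b", 3),
    ("add", "t", "a", "b"),
    ("ret", "t"),
]

fold_then_dce = eliminate_dead_code(fold_constants(program))
dce_then_fold = fold_constants(eliminate_dead_code(program))

print(fold_then_dce)
print(dce_then_fold)
```

With folding first, the two original constants become dead and get removed. With dead code elimination first, nothing is dead yet, so the later fold leaves `a` and `b` sitting there until some other cleanup pass runs.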
> This is not to imply that we should get rid of SQL or get rid of query planning entirely. Rather, more explicit planning would be an additional tool in a database user’s toolbelt.
I'm not sure if there was some specific part of the blog post that made you think I'm against automatic query planning altogether; if there was, please share that so that I can tweak the wording to remove that implication.
The quote from another article (which I didn't read) starting with "I dislike query planners".
"Against ... altogether" is mildly stronger than I took away from this, more like "generally of the opinion that the tradeoff nearly everyone is making with sql isn't worth it".
Judging by the lack of upvotes, other people didn't react as strongly to this quote as I did, so take it as you will.
I'm surprised that the "query planner" doesn't have a way to emit an opaque object, the "assembly language of the query", that you can run and that it is not allowed to change.
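For comparison, some engines will at least show you the plan they picked, even if you can't freeze it. A minimal sketch using Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN`; the `events` table and `events_by_user` index are made up for illustration, and the plan text is meant for humans and varies between SQLite versions:

```python
import sqlite3

# In-memory database with a hypothetical table and index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, kind TEXT)"
)
conn.execute("CREATE INDEX events_by_user ON events (user_id)")

# Ask SQLite how it would run the query, without running it.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT kind FROM events WHERE user_id = ?", (42,)
).fetchall()

# The human-readable description is the last column of each row.
plan_details = [row[-1] for row in plan_rows]
for detail in plan_details:
    print(detail)
```

This is inspection only: the output is a description, not something you can hand back to the engine and say "run exactly this", which is the missing piece.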
> LLVM supports an interesting feature called Optimization Remarks – these remarks track whether an optimization was performed or missed. Clang supports recording remarks using -fsave-optimization-record and Rustc supports -Zremark-dir=<blah>. There are also some tools (opt-viewer.py, optview2) to help view and understand the output.
Julia is by far the worst language about this. It would be vastly more usable with the addition of @assert_type_stable, @assert_doesn’t_allocate, and @assert_doesn’t_error macros.
I agree that Julia takes the idea of optimization to the extreme - it's semantically a very dynamic language and only fast thanks to optimizations that its semantics don't guarantee. On the other hand, getting access to the generated IR, LLVM and assembly and iteratively improving it is far easier than in any other language I've seen.
> Have a good mental model of what the optimizer can and cannot do.
Most DB query planner designers and implementers have little imagination, and their mental model of what optimizers can and cannot do is, well, extremely narrow-minded. There is a huge unexplored space of what query planning can be (at least for analytic queries, when thinking in columnar terms) - if we just stop insisting on thinking of DBMS operations as black boxes.