Posted by melodyogonna 10/28/2025

The Impossible Optimization, and the Metaprogramming to Achieve It (verdagon.dev)
73 points | 25 comments
omnicognate 11/1/2025|
The language here is Mojo, which the article seems to assume you know and doesn't say enough for you to deduce until halfway through, after multiple code examples. I don't know how you're supposed to know this, as even the blog it's on is mostly about Vale. From the intro I was expecting it to be about C++.
totalperspectiv 11/1/2025|
The author works for Modular. He shared the write-up on the Mojo Discord. I think Mojo users were the intended audience.
Xcelerate 11/1/2025||
> Eliminate redundant matrix operations (like two transposes next to each other)

In 2016, I was trying to construct orthogonal irreducible matrix representations of various groups (“irreps”). The problem was that most of the papers describing how to construct these matrices used a recursive approach that depended on having already constructed the matrix elements of a lower dimensional irrep. Thus the irrep dimension n became quite an annoying parameter, and function calls were very slow because you had to construct the irrep for each new group element from the ground up on every single call.

I ended up using Julia’s @generated functions to dynamically create new versions of the matrix construction code for each distinct value of n for each type of group. So essentially it would generate “unrolled” code on the fly and then use LLVM to compile it a single time, after which all successive calls for a specific group and irrep dimension were extremely fast. Was really quite cool. The only downside was that you couldn’t generate very high-dimensional irreps because LLVM would begin to struggle with the sheer volume of code it needed to compile, but for my project at the time that wasn’t much of a concern.
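
For anyone who hasn't seen the trick, here is a rough C++ template analogue of the same idea (a hypothetical sketch, not the original Julia code): make the dimension n a compile-time parameter, so the compiler emits one specialized, fully unrollable version per distinct n and reuses it on every later call.

  #include <array>

  // Hypothetical sketch: the dimension N is a template parameter, so each
  // distinct N gets its own specialization that the optimizer can unroll,
  // much like a @generated function keyed on the irrep dimension.
  template <int N>
  std::array<std::array<double, N>, N> make_irrep(int group_element) {
      std::array<std::array<double, N>, N> m{};
      for (int i = 0; i < N; ++i)       // trip counts known at compile time
          for (int j = 0; j < N; ++j)
              m[i][j] = (i == j) ? 1.0 : 0.0;  // stand-in for the real recursive construction
      (void)group_element;  // the real code would depend on this
      return m;
  }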

aappleby 11/1/2025||
I did something similar to this with C++ templates - it's Parsing Expression Grammar-based, so not full regex, but enough for a lot of tasks:

  using sign      = Atoms<'+', '-'>;
  using digit     = Range<'0', '9'>;
  using onenine   = Range<'1', '9'>;
  using digits    = Some<digit>;
  using integer   = Seq<Opt<Atom<'-'>>, Oneof<Seq<onenine, digits>, digit>>;
  using fraction  = Seq<Atom<'.'>, digits>;
  using exponent  = Seq<Atoms<'e', 'E'>, Opt<sign>, digits>;
  using number    = Seq<integer, Opt<fraction>, Opt<exponent>>;
and I've confirmed that it all gets inlined and optimized at -O3.

JSON parser example here - https://github.com/aappleby/matcheroni/blob/main/examples/js...
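
For readers who haven't seen the technique, a much-reduced hypothetical sketch of how such combinators can be built (not matcheroni's actual interface): each matcher is a type whose static match() returns the position after a successful match, or nullptr on failure, so nested Seq/Opt/Some calls collapse into straight-line code under -O3.

  // Hypothetical mini-version, not matcheroni's real API.
  template <char C>
  struct Atom {
      static const char* match(const char* s, const char* end) {
          return (s != end && *s == C) ? s + 1 : nullptr;
      }
  };

  template <char Lo, char Hi>
  struct Range {
      static const char* match(const char* s, const char* end) {
          return (s != end && Lo <= *s && *s <= Hi) ? s + 1 : nullptr;
      }
  };

  template <typename P, typename... Ps>
  struct Seq {  // all parts in order
      static const char* match(const char* s, const char* end) {
          const char* m = P::match(s, end);
          if constexpr (sizeof...(Ps) == 0) return m;
          else return m ? Seq<Ps...>::match(m, end) : nullptr;
      }
  };

  template <typename P, typename... Ps>
  struct Oneof {  // ordered choice
      static const char* match(const char* s, const char* end) {
          const char* m = P::match(s, end);
          if constexpr (sizeof...(Ps) == 0) return m;
          else return m ? m : Oneof<Ps...>::match(s, end);
      }
  };

  template <typename P>
  struct Opt {
      static const char* match(const char* s, const char* end) {
          const char* m = P::match(s, end);
          return m ? m : s;  // optional: failure consumes nothing
      }
  };

  template <typename P>
  struct Some {  // one or more repetitions
      static const char* match(const char* s, const char* end) {
          const char* m = P::match(s, end);
          if (!m) return nullptr;
          for (const char* n = P::match(m, end); n; n = P::match(m, end)) m = n;
          return m;
      }
  };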

Archit3ch 11/1/2025||
> Mojo, D, Nim, and Zig can do it, and C++ as of C++20. There are likely some other languages that can do it, but these are the only ones that can truly run normal run-time code at compile time

Pretty sure Julia can do it.
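
For the C++20 part of that quote, a minimal illustrative sketch of what "running normal run-time code at compile time" looks like - since C++20, constexpr functions may even allocate:

  #include <vector>

  // Ordinary container code, evaluated entirely during compilation
  // (constexpr allocation is allowed as long as the memory does not
  // outlive the compile-time evaluation).
  constexpr int sum_of_squares(int n) {
      std::vector<int> v;
      for (int i = 1; i <= n; ++i) v.push_back(i * i);
      int total = 0;
      for (int x : v) total += x;
      return total;
  }

  static_assert(sum_of_squares(4) == 30);  // forced compile-time evaluation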

jburgy 11/1/2025|
https://bur.gy/2022/05/27/what-makes-julia-delightful.html confirms your hunch
lisper 11/1/2025||
> can compilers really execute general code at compile-time?

Cue the smug Lisp weenies laughing quietly in the background.

Panzerschrek 11/2/2025||
I spent several years writing a dedicated compiler for regular expressions. You basically pass it a regular expression and get an object file containing optimized matching code. It uses the LLVM library internally to perform optimizations and machine-code generation. It should generally compile faster than solutions involving constexpr-based metaprogramming.

I am surprised that no programming language does something similar - regular expressions compiled to native code instead of handled by a runtime library like PCRE2. Implementing this in C++ or Rust should be relatively easy.
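
To illustrate the output side (this is not my compiler's actual output, just a sketch): for a pattern like [0-9]+ the object file would contain something like a hard-coded DFA loop with C linkage:

  #include <cstddef>

  // Sketch of the kind of function such a compiler emits: the regex
  // lowered to a fixed state-machine loop, exported with C linkage so
  // it can be linked from the generated object file.
  extern "C" bool match_digits(const char* s, std::size_t len) {
      std::size_t i = 0;
      if (i == len || s[i] < '0' || s[i] > '9') return false;   // need one digit
      for (++i; i < len && s[i] >= '0' && s[i] <= '9'; ++i) {}  // consume the rest
      return i == len;  // anchored: the whole input must match
  }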

lostmsu 11/2/2025|
.NET has had this since 2.0, if not 1.0 - see Regex.CompileToAssembly and RegexOptions.Compiled.
abeppu 11/1/2025||
To tie this specific example to a larger framework: in Scala land, Tiark Rompf's Lightweight Modular Staging (LMS) system handled this class of metaprogramming elegantly, and the 'modular' part included support for multiple compilation targets. The idea was that one could incrementally define/extend DSLs that produce an IR, optimizations on that IR, and code generation for chunks of DSLs. Distinctions between stages are straightforward type-signature changes. The worked example in this post is very similar to one of the tutorials for that system: https://scala-lms.github.io/tutorials/regex.html

Unfortunately, so far as I can tell:

- LMS has not been updated for years and never moved to scala 3. https://github.com/TiarkRompf/virtualization-lms-core

- LMS was written to also use "scala-virtualized" which is in a similar situation

There's a small project to attempt to support it with virtualization implemented in scala 3 macros, but it's missing some components: https://github.com/metareflection/scala3-lms?tab=readme-ov-f...

I'd love to see this fully working again.

taeric 11/1/2025||
Gave me a smile to see the shout-out to LISP in there.

Reading this take on it, it feels like a JIT compiler could also accomplish a fair bit of this? I'm also reminded of the way a lot of older programs would generate tables during build time. I'm assuming that is still fairly common?
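
On the build-time tables point: it's not only still common, it's now often done in-language. A minimal C++ constexpr sketch (illustrative, not from the article) of the classic pattern, a CRC32 lookup table baked into the binary at compile time:

  #include <array>
  #include <cstdint>

  // The classic build-time lookup table, computed by constexpr
  // evaluation instead of a separate generator program.
  constexpr std::array<std::uint32_t, 256> make_crc32_table() {
      std::array<std::uint32_t, 256> t{};
      for (std::uint32_t i = 0; i < 256; ++i) {
          std::uint32_t c = i;
          for (int k = 0; k < 8; ++k)
              c = (c & 1) ? 0xEDB88320u ^ (c >> 1) : c >> 1;
          t[i] = c;
      }
      return t;
  }

  constexpr auto kCrc32Table = make_crc32_table();  // stored in the binary
  static_assert(kCrc32Table[0] == 0);               // sanity check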

pfdietz 11/1/2025||
Yes, this is all straightforward with Lisp macros. Beyond that, you can call the compile function in Common Lisp and do all this at run time too.
BoingBoomTschak 11/1/2025||
In fact, there's https://github.com/telekons/one-more-re-nightmare for CL.
SuperV1234 11/1/2025|
https://github.com/hanickadot/compile-time-regular-expressio...
canucker2016 11/1/2025|
FYI: this is a C++ template-based compile-time regex library.

A 54:47 presentation at CppCon 2018 is worth more than a thousand words...

see https://www.youtube.com/watch?v=QM3W36COnE4

followup CppCon 2019 video at https://www.youtube.com/watch?v=8dKWdJzPwHw

As the above github repo mentions, more info at https://www.compile-time.re/
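
A small usage sketch, assuming the library's C++20 string-literal interface (see the repo for the real documentation) - the pattern is parsed and turned into matching code during compilation, with no regex engine running at run time:

  #include <ctre.hpp>       // single-header CTRE
  #include <string_view>

  constexpr bool is_number(std::string_view sv) {
      return ctre::match<"[-+]?[0-9]+(\\.[0-9]+)?([eE][-+]?[0-9]+)?">(sv);
  }

  static_assert(is_number("-12.5e3"));
  static_assert(!is_number("12a"));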
