The Impossible Optimization, and the Metaprogramming to Achieve It

Posted by melodyogonna 10/28/2025

The Impossible Optimization, and the Metaprogramming to Achieve It (verdagon.dev)

73 points | 25 comments | page 2

cadamsdotcom 11/1/2025|

Seems like an optimization that could be applied quite generally - as the author mentions at the end, there are lots of places this could be used.

The problem with applying this technique generally is the amount of code generated. But what if you could optimize that too? Perhaps share the common parts of the AST between the generated copies of the code, and overlay the changes with some data structure.

spectraldrift 11/1/2025||

Having never heard of Mojo before, I found this article fascinating. It provides a great example of how a toy regex parser works and an excellent explanation of why vanilla regex tends to be slow. It also presents a novel solution: compiling the regex into regular code, which can then be optimized by the compiler.
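To make "compiling the regex into regular code" concrete, here is a minimal sketch in Python rather than Mojo. It is not the article's implementation. The function names `compile_pattern` and `_class_test` are hypothetical, and the supported syntax is a deliberately tiny subset: literal characters (including `.`, which is treated literally here), `[a-z0-9]` style classes, and a trailing `+`. Repetition is possessive, so there is no backtracking and this is not full regex semantics.

```python
def _class_test(body):
    """Turn the inside of a [...] class into a Python boolean expression on s[i]."""
    parts = []
    k = 0
    while k < len(body):
        if k + 2 < len(body) and body[k + 1] == "-":
            # Range such as a-z
            parts.append(f"{body[k]!r} <= s[i] <= {body[k + 2]!r}")
            k += 3
        else:
            # Single character such as _
            parts.append(f"s[i] == {body[k]!r}")
            k += 1
    return " or ".join(parts)


def compile_pattern(pattern):
    """Translate the pattern into straight-line Python source, then exec it.

    Assumption: possessive '+' only, no backtracking, no alternation.
    Returns the generated function and its source text.
    """
    lines = ["def match(s):", "    i = 0", "    n = len(s)"]
    pos = 0
    while pos < len(pattern):
        if pattern[pos] == "[":
            end = pattern.index("]", pos)
            test = _class_test(pattern[pos + 1:end])
            pos = end + 1
        else:
            test = f"s[i] == {pattern[pos]!r}"
            pos += 1
        # Every item must match at least once.
        lines.append(f"    if i >= n or not ({test}): return False")
        lines.append("    i += 1")
        if pos < len(pattern) and pattern[pos] == "+":
            pos += 1
            # '+' becomes a plain loop specialized to this one test.
            lines.append(f"    while i < n and ({test}): i += 1")
    lines.append("    return i == n")
    source = "\n".join(lines)
    namespace = {}
    exec(source, namespace)
    return namespace["match"], source


match_email, generated = compile_pattern("[a-z0-9]+@[a-z]+.com")
```

Printing `generated` shows a function with no pattern interpretation left in it, only comparisons and loops, which is the part an optimizing compiler can then work on.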

convolvatron 11/1/2025|

this is literally how 'lex' works. the one written in 1987 by Vern Paxson.

jlokier 11/1/2025|||

The original is 'lex', written in 1975 by Mike Lesk and Eric Schmidt.

Yes, that Eric Schmidt, later CEO of Google.

1987 was the clone, 'flex' :-)

It did "compiling the regex into regular code, which can then be optimized by the compiler" before the C programming language as we know it was created. I think 'lex' was compiling regex to C before the C language even had 'struct' types, 'printf' or 'malloc'.

spectraldrift 11/3/2025|||

So I'm only 40 years behind! It's amazing how early innovations like this seamlessly fade into the background and can be taken for granted by folks like myself.

fragmede 11/1/2025|

Depending on the userbase of the site, simply checking for @gmail.com at the end, I'd bet, would result in a quick win, as well as restricting the username's alphabet to allowed Gmail characters.

The other optimization I'd guess at would be to async/thread/process the checking before and after the @ symbol, so they can run in parallel (ish). Extra cpu time, but speed > CPU cycle count for this benchmark.

gus_massa 11/1/2025||

[Rehashing an old comment]

In the math department, we had a Moodle for the first-year students of my university in Argentina.

When we started, like 15 years ago, the emails of the students and TAs were split roughly 30% Gmail, 30% Yahoo!, 30% Hotmail and 10% others (very approximate numbers).

Now the students have like 80% Gmail, 10% Live/Outlook/Hotmail and 10% others/Yahoo. Some of the TAs are much older, so perhaps "only" 50% use Gmail.

The difference is huge. I blame the mandatory Gmail account for the cell phone.

So, checking only @gmail.com is too strict, but a fast first check for @gmail.com, followed by the complete regex, may improve the speed a lot in the real world.
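A rough sketch of that "fast check first, full regex second" idea in Python. Everything here is an assumption for illustration: `FULL_EMAIL` is a simplified stand-in and not the regex from the article, `is_valid_email` is a hypothetical name, and the Gmail username alphabet is a conservative guess (lowercase letters, digits and dots), not Gmail's actual rules.

```python
import re

# Simplified stand-in for the complete email regex (assumption).
FULL_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

GMAIL_SUFFIX = "@gmail.com"
# Conservative guess at the allowed username characters (assumption).
GMAIL_LOCAL_CHARS = frozenset("abcdefghijklmnopqrstuvwxyz0123456789.")


def is_valid_email(address):
    """Cheap suffix check for the common case, full regex for everything else."""
    if address.endswith(GMAIL_SUFFIX):
        local = address[:-len(GMAIL_SUFFIX)]
        if local and all(c in GMAIL_LOCAL_CHARS for c in local):
            return True
    # Anything the fast path does not accept still gets the complete check,
    # so non-Gmail addresses are not rejected.
    return FULL_EMAIL.fullmatch(address) is not None
```

The fast path only ever returns True for addresses the full regex would also accept, so it changes the speed of the common case without changing which addresses are valid.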

bee_rider 11/1/2025|||

Maybe I am old, but I like to keep as much communication as possible going through the university email. It just feels more official somehow.

embedding-shape 11/1/2025||

Tell us you used to work at Google, without telling us.

"simply do X" is such a programmer fallacy at this point I'm surprised we don't have a catchy name for it yet, together with an XKCD for making the point extra clear.

fragmede 11/1/2025||

Tell us you don't actually work with any Google engineers... blah blah blah

The trope is "At Google we..." and then casually mention "violating" the CAP theorem with Spanner or something.

It is simple, and I really do hope any first year CS student could extract a substring from a string. Have LLMs so atrophied our programming ability that extraction of a substring is considered evidence of a superior programmer?