Top
Best
New

Posted by todsacerdoti 6 days ago

Faster practical modular inversion(purplesyringa.moe)
16 points | 3 comments
fn-mote 10 minutes ago|
The article is quite interesting. It ranges from a mathematical discussion (it explains Stein’s variant of the GCD algorithm if you make it far enough) to instruction-set-level issues.

For example:

> GCC documents __builtin_clz(0) as having an “undefined result”, so I initially assumed it means an indeterminate value. In reality, GCC maintainers consider it UB and LLVM documents it as UB… but the optimizers seem to model it exactly like an indeterminate value? (e.g. LLVM considers @llvm.cttz(0) to produce poison) This is frankly ridiculous, someone do something about it

There is a very interesting table of raw timings, showing variation from about 30% faster to 10% slower across a variety of bit widths (of the numbers involved) and architectures. This is a VERY thorough level of testing!!

PS It’s not a dupe if no discussion occurred… that’s just the second chance pool.

signa11 56 minutes ago|
dupe: https://news.ycombinator.com/item?id=46344548
867-5309 12 minutes ago||
anachronism: