Float Exposed - Hacker News

Posted by SomaticPirate 9/12/2025

Float Exposed(float.exposed)

417 points | 114 comments | page 3

dorianmariecom 9/12/2025||

not obvious you need to press enter to change the value

fnord77 9/12/2025||

assuming this is IEEE 754

KingLancelot 9/12/2025||

[dead]

GistNoesis 9/12/2025|

For a .exposed domain, it's not really shocking.

The real shocking fact about floating point is that they are even used at all.

It throws out of the window the most basic property that operations on numbers should have, "associativity", and all that for a gain in dynamic range which is not necessary most of the time.

The associativity we expect to hold is (a+b)+c == a+(b+c) and (ab)c == a(bc) and these don't hold for floats even though most math formulas and compiler optimizations rely on these to hold. It's a sad miracle that everything somehow still works out OK most of the time.
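A minimal sketch of that non-associativity in Python, assuming the usual case where Python's `float` is an IEEE 754 double. The values are just illustrative:

```python
# Floating-point addition is not associative: the grouping decides
# where the intermediate rounding happens.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left == right)  # False
print(left, right)    # 0.6000000000000001 0.6
```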

You lose determinism most of the time with respect to compiler optimizations, and platform reproducibility if processors don't exactly respect IEEE 754 (or is it IEEE 854).

The real problem comes when you want to use parallelism. With things like atomic operations and multiple processor doing things out of order, you lose determinism and reproducibility, or add a need for synchronisation or casting operations everywhere.
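To make the parallelism point concrete, here is a small simulation in Python. No real threads are involved; `chunked_sum` and `naive_sum` are hypothetical helpers that only mimic how a parallel reduction splits the work, and the data is chosen to exaggerate the effect:

```python
import math

def naive_sum(values):
    """Plain left-to-right float accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total

def chunked_sum(values, workers):
    """Sum `values` as `workers` independent partial sums, then combine them."""
    size = -(-len(values) // workers)  # ceiling division
    partials = [naive_sum(values[i:i + size]) for i in range(0, len(values), size)]
    return naive_sum(partials)

data = [1e20, 1.0, -1e20, 1.0]  # the exact sum is 2.0

one_worker = chunked_sum(data, 1)   # 1.0
two_workers = chunked_sum(data, 2)  # 0.0
exact = math.fsum(data)             # 2.0, correctly rounded and order-independent

print(one_worker, two_workers, exact)
```

The same input gives a different answer depending on how the work is split, which is the reproducibility problem described above. `math.fsum` shows one way out, at the cost of extra work per element.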

Even more problematic is that because number operations are used so often, they are set in "stone" and implemented at the hardware level. And they use many more transistors because they are more complex than integer arithmetic.

Real programmers don't use floating points, only sloppy lazy ones do.

Real programmers use fixed point representation and make sure the bounds don't overflow/underflow unexpectedly.

Let's ban all hardware floating-point implementation : Just imagine future alien archeologists having a laugh at us when they look at our chips and think "no wonder they were doomed they can't even do a+b right : its foundations were built on sand".

imtringued 9/12/2025||

I hope nobody listens to lunatics like you. Fixed point sucks. Who on earth wants to analyze every single algorithm for its numerical stability? If you work with FPGAs, then converting a known to be working algorithm to fixed point is one of the most time consuming things you can do and it eats DSP slices like crazy. Every square or square root operation will cause a shiver to run down your spine because of the lack of dynamic range.

Edit:

For those who want more context:

https://vbn.aau.dk/ws/portalfiles/portal/494103077/WPMC22_Ko...

Here is a big fat warning:

>There is one important downside though, which relates to the fact that designing fixed-point algorithms is a significantly more complicated task as compared to similar floating-point based algorithms. This fact essentially has two major consequences: >1) The designer must have an extended mathematical knowledge about the numerical characteristics of the algorithms, and >2) The development time is in some cases longer than for equivalent floating-point systems.

the-grump 9/12/2025|||

Real programmers use an abacus and never take roots or calculate logarithms/powers.

Amazing confidence on display here.

jcranmer 9/12/2025|||

> these don't hold for floats even though most math formulas and compiler optimizations rely on these to hold.

Most compilers have an option like -fassociative-math that explicitly allows optimization under the assumption of associativity and distributivity.

> Real programmers use fixed point representation and make sure the bounds don't overflow/underflow unexpectedly.

So you complain that floating-point is bad because it's not associative but then suggest that we use fixed-point instead (which is also nonassociative), but it's okay, because it's fine as long as you do things that programmers rarely do.

> Let's ban all hardware floating-point implementation : Just imagine future alien archeologists having a laugh at us when they look at our chips and think "no wonder they were doomed they can't even do a+b right : its foundations were built on sand".

Ah, you're the kind of person who sees that 0.1 + 0.2 != 0.3 and decides to go on a crusade against floating-point because of it. By the way, fixed point has that exact same bug: it's a fault that is caused by the base being different more than the other principles of the floating-point type.
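A rough Python sketch of the "it's the base, not the floating point" argument. The Q16.16 format (`SCALE = 2**16`) and the `to_fixed` helper are assumptions picked for illustration, not something from the thread:

```python
from decimal import Decimal
from fractions import Fraction

SCALE = 2 ** 16  # hypothetical Q16.16 binary fixed-point format

def to_fixed(x: Fraction) -> int:
    """Round an exact rational to the nearest multiple of 1/SCALE."""
    return round(x * SCALE)

tenth = to_fixed(Fraction(1, 10))

# Binary fixed point stores 6554/65536, which is not exactly 1/10.
print(Fraction(tenth, SCALE) == Fraction(1, 10))  # False

# Binary floating point has the same representation problem.
print(0.1 + 0.2 == 0.3)  # False

# A base-10 representation holds these particular values exactly.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```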

Floating-point has trade-offs; a good programmer understands the trade-offs of floating-point and will not ask more of it than it can provide.

Findecanor 9/12/2025|||

I haven't seen compilers optimise integer `a+(b+c)` to `(a+b)+c` that much either.

I'm looking for ways to do that in my home-made compiler back-end. If you've got an example (on compiler explorer, a paper, blog or whatever), I'd be interested in reading about it.

I'd agree that fixed point would have sufficed in many cases where people use floating point. But floating point can be more convenient, with the increased range providing some protection against overflow.

GistNoesis 9/12/2025|||

These optimizations are often hidden when the compiler needs to do some loop interchange optimizations.

If the order of the operations is important because you don't have associativity then you can't legally do it.
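A toy illustration in Python of why interchanging the loops of a float reduction is not a value-preserving transformation. The 2x2 `grid` is contrived to make the difference obvious; real data usually differs only in the last few bits:

```python
grid = [[1e20, 1.0],
        [-1e20, 1.0]]  # the exact sum is 2.0

def sum_row_major(m):
    """Outer loop over rows, inner loop over columns."""
    total = 0.0
    for row in m:
        for x in row:
            total += x
    return total

def sum_col_major(m):
    """Same reduction with the two loops interchanged."""
    total = 0.0
    for j in range(len(m[0])):
        for i in range(len(m)):
            total += m[i][j]
    return total

print(sum_row_major(grid))  # 1.0
print(sum_col_major(grid))  # 2.0
```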

You can have special flags (-fassociative-math) for floats which allow the compiler to treat them as if they were associative, but these mean your program's result will depend on which optimization the compiler picked.

And it turns out that these loop reordering optimizations are really useful when you need to do some backward automatic differentiation. Because all the loops are basically iterated in reverse for the automatically generated code of the backward pass.

But the memory access patterns for the backward pass are not contiguous if you don't interchange the loop order, which the compiler can't do legally because of floats. Nor can it then merge loops together, which is really useful because if you can merge the forward pass with the backward pass then you don't have to keep values inside a "tape".

So basically you can't rely on compiler optimizations, so your auto-differentiator can't benefit from existing compiler progress. (You can have a look at either Julia's Zygote or Enzyme, which rely on compiler optimizations chaining well.) Or you write backward passes manually.

guipsp 9/12/2025|||

You have for sure seen this in constant propagation.

Findecanor 9/12/2025||

Pruning the data-flow graph depth-first, sure, but moving edges in it is beyond anything I've read so far.

librasteve 9/12/2025|||

try rotating & shading triangles in 3D with integer math … FPUs have earned their place

GistNoesis 9/12/2025||

https://en.wikipedia.org/wiki/Fixed-point_arithmetic : allows you to have something which is integer math but works like floats. It's integer operations and bit shifts, so really fast.
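For readers who have not used it, a minimal sketch of the idea in Python. The Q16.16 layout (`FRAC_BITS = 16`) and the helper names are assumptions for illustration. Python integers never overflow, so the bounds checking a real implementation needs is not shown:

```python
FRAC_BITS = 16        # hypothetical Q16.16 format: 16 fractional bits
ONE = 1 << FRAC_BITS  # fixed-point representation of 1.0

def to_fixed(x: float) -> int:
    """Convert to fixed point, rounding to the nearest representable step."""
    return round(x * ONE)

def to_float(f: int) -> float:
    return f / ONE

def fx_add(a: int, b: int) -> int:
    """Addition is plain integer addition."""
    return a + b

def fx_mul(a: int, b: int) -> int:
    """The raw product has 2*FRAC_BITS fractional bits; shift back down."""
    return (a * b) >> FRAC_BITS

a, b = to_fixed(1.5), to_fixed(2.25)
print(to_float(fx_add(a, b)))  # 3.75
print(to_float(fx_mul(a, b)))  # 3.375
```

Note that in a fixed-width language the intermediate product in `fx_mul` needs twice the bits of its operands, which is where the overflow analysis comes in.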

The limitation is the minimal quantization level. But for a 3D engine, let's say your base increment is nanometers. Then you set your maximum dimension, let's say 1000 km. You only have to be able to represent numbers up to 10^15, so a 64-bit fixed-point number is good enough.

Do everything in 128-bit fixed-point numbers, and floats are no longer needed for anything scientific.
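A quick check of the range arithmetic for those figures (1 nm increment, 1000 km extent), sketched in Python. The variable names are mine:

```python
NM_PER_KM = 10 ** 12           # 1 km = 10^3 m = 10^12 nm
extent_nm = 1000 * NM_PER_KM   # 1000 km expressed in nanometers

bits_for_coordinate = extent_nm.bit_length()
bits_for_square = (extent_nm ** 2).bit_length()

print(extent_nm == 10 ** 15)   # True
print(bits_for_coordinate)     # 50
print(bits_for_square)         # 100
```

So a single coordinate fits comfortably in a signed 64-bit integer, but an intermediate such as a squared distance does not, which is one reason a wider type (or careful rescaling) comes up as soon as you multiply.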

lifthrasiir 9/12/2025|||

In modern systems float ops are often as fast as corresponding integer ops, so fixed point numbers are not necessarily faster now.

librasteve 9/12/2025||||

for general computation, I think Rationals (https://raku.org) are a good choice - and Raku has big Int as standard also

nevertheless, us Weitek guys made 32-bit FPUs to do 3D graphics (pipeline, 1 instruction per clock) P754, IBM, DEC standards to power SGI, Sun etc

this is still the best format to get graphics throughput per transistor (although the architectures have got a bit more parallel)

then 64-bit became popular for CAD (32-bit means the wallpaper in your aircraft carrier might sometimes be under the surface of your wall)

Findecanor 9/12/2025||

An alternative numerical notation uses decimals but marks which digits at the end that are repeating. With enough digits, this format can represent all rational numbers that can be written in the standard numerator/denominator format.
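As a sketch of the notation only (not of arithmetic on it), here is one way to produce the repeating form from a numerator/denominator pair in Python, using long division and remembering which remainders have been seen. Writing the repeating digits in parentheses is an assumed convention, and `repeating_decimal` is a hypothetical helper that handles non-negative values only:

```python
from fractions import Fraction

def repeating_decimal(q: Fraction) -> str:
    """Render a non-negative rational, with repeating digits in parentheses."""
    whole, rem = divmod(q.numerator, q.denominator)
    digits = []
    seen = {}  # remainder -> index in `digits` where it first appeared
    while rem != 0 and rem not in seen:
        seen[rem] = len(digits)
        digit, rem = divmod(rem * 10, q.denominator)
        digits.append(str(digit))
    if rem == 0:  # the expansion terminates
        frac = "".join(digits)
        return f"{whole}.{frac}" if frac else str(whole)
    start = seen[rem]  # the cycle begins where this remainder first occurred
    return f"{whole}." + "".join(digits[:start]) + "(" + "".join(digits[start:]) + ")"

print(repeating_decimal(Fraction(1, 3)))  # 0.(3)
print(repeating_decimal(Fraction(1, 6)))  # 0.1(6)
print(repeating_decimal(Fraction(1, 7)))  # 0.(142857)
print(repeating_decimal(Fraction(1, 4)))  # 0.25
```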

It does of course work with base 2 and exponents as well so you could still be using floating-point format, only with additional meta-data indicating the repeating range. When a result degenerates into a number that can't fit within the number of digits, you would be left with a regular floating-point number.

I'd want to write a simple calculator that uses this numerical format but I have only been able to find algorithms for addition and subtraction. Every description I've found of the format has converted to the regular numerator/denominator form before multiplication and division.

simonask 9/12/2025|||

No 3D engine in the real world uses 64-bit coordinates. With 32-bit coordinates, you could not hope to represent things in nanometers (you'd be stuck in a cube roughly 4x4x4 meters). Realistically you might choose millimeters, but that would certainly start to produce visible artifacts.

For games and most simulation, the "soft failure" of gradual precision loss is much more desirable than the wildly wrong effects you would get from fixed-point overflow.
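To put rough numbers on both failure modes, a small Python sketch. `f32_ulp` and `wrap_i32` are hypothetical helpers: the first reports the gap between adjacent 32-bit floats (positive finite inputs only), the second emulates two's-complement wraparound of a 32-bit integer:

```python
import struct

def f32_ulp(x: float) -> float:
    """Gap between x (rounded to float32) and the next float32 above it."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    cur = struct.unpack("<f", struct.pack("<I", bits))[0]
    nxt = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return nxt - cur

def wrap_i32(n: int) -> int:
    """Two's-complement wraparound of n to a signed 32-bit integer."""
    return (n + 2 ** 31) % 2 ** 32 - 2 ** 31

# Range of a 32-bit integer counted in nanometers, expressed in meters.
print(2 ** 32 / 1e9)  # 4.294967296

# Float32 precision degrades gradually as the magnitude grows.
print(f32_ulp(1.0) == 2.0 ** -23)      # True
print(f32_ulp(1_000_000.0))            # 0.0625
print(f32_ulp(16_777_216.0))           # 2.0

# Integer overflow is abrupt: one step past the maximum lands at the minimum.
print(wrap_i32(2 ** 31 - 1 + 1))       # -2147483648
```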

Etherlord87 9/12/2025|||

This kind of problem appears also with floats, just later with 32-bit floats than with 64-bit ints.

And the solution to this problem is to adjust your coordinate space, e.g. make every nanometer represented as `1` but have the containing object matrix have scale fields set to 1e-9.

So this is not a theoretical problem, just a practical one: the z-fighting you get with floats, would happen much more often with integers - you absolutely can avoid it in both cases, but practically 3D engines are designed with performance in mind, and so some assumptions lead to limitations and you would get more of them with integers.

GistNoesis 9/12/2025||||

The https://en.wikipedia.org/wiki/Z-fighting issue is the proof you often need those 64-bits.

It's kind of a chicken and egg problem where people use floats because there are FPUs available. All the engineering effort which went into dealing with floats and the problems that come with them would have been better invested in making integers faster.

We went onto the wrong path, and inertia keeps us going on the wrong path. And now the wrong path is even more tempting because all efforts have made it more practical and almost as good. We hide the precision complexity from the programmer but it's still lurking around instead of being tamed.

The absolute GPU cluster-fuck, with as many floating types as you can write on a napkin while drunk at the bar, means that at the end of the day your neural network is non-deterministic, and you can't replicate any result from your program from 6 months ago, or from the last library version. Your simulation results therefore are perishable.

Inability to replicate results means that you can't verify that the weights of your neural networks haven't been tampered with by an adversary. So you just lose all fighting chance to build a secure system.

You also can't share work in a distributed fashion because since verification is not possible you can't trust any computation that you haven't done yourself.

TinkersW 9/12/2025||

On the CPU side, yes 64 bits is a good idea, but when transferring to the GPU you simply make the camera location 0,0,0, and transform everything relative to it, thus you can easily use 32-bit floats and have no z-fighting or any other precision-related issues (a logarithmic depth buffer also helps).
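A small numeric sketch of the camera-relative trick in Python. `to_f32` is a hypothetical helper that rounds a Python double to the nearest 32-bit float, and the coordinates are made up:

```python
import struct

def to_f32(x: float) -> float:
    """Round a double to the nearest float32 and return it as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

camera_x = 1_000_000.0    # world-space position in meters, float64 on the CPU
object_x = 1_000_000.123  # an object 12.3 cm away from the camera

# Converting world coordinates to float32 first throws the detail away.
naive = to_f32(object_x) - to_f32(camera_x)

# Subtracting in float64 first, then converting, keeps it.
relative = to_f32(object_x - camera_x)

print(naive)                         # 0.125
print(abs(relative - 0.123) < 1e-6)  # True
```

At a magnitude of one million, adjacent 32-bit floats are 0.0625 apart, so the naive path snaps the object to the nearest sixteenth of a meter.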

Regarding 64 bit double vs 64 bit fixed width, I don't think there is a really good reason to bother with fixed width, it adds more instructions, and will require a custom debug visualizer to inspect the values.

Bit shifts, at least in SSE/AVX2 etc, are only able to run on a single port, so they actually aren't such a great idea (not sure about scalar, I don't bother to optimize scalar code in this way).

eska 9/12/2025|||

Regarding your second paragraph, those issues are equally catastrophic for game engines. Therefore they generally use (float x,y,z,int zone_id) to reset the origin and avoid floating point errors. Think MMOs, open world games, etc. There are talks about this from all the way back to Dungeon Siege up to Uncharted
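A one-axis sketch of that layout in Python, under stated assumptions: `ZONE_SIZE`, `ZonedX`, `from_world` and `to_world` are invented for illustration, and real engines will differ in how zones are laid out and identified:

```python
from dataclasses import dataclass

ZONE_SIZE = 1024.0  # hypothetical zone edge length in meters

@dataclass
class ZonedX:
    zone_id: int  # which zone the point is in
    x: float      # offset inside the zone, kept in [0, ZONE_SIZE)

def from_world(world_x: float) -> ZonedX:
    """Split a large world coordinate into a zone index and a small offset."""
    zone, local = divmod(world_x, ZONE_SIZE)
    return ZonedX(int(zone), local)

def to_world(p: ZonedX) -> float:
    return p.zone_id * ZONE_SIZE + p.x

p = from_world(1_000_000.123)
print(p.zone_id)                      # 976
print(abs(p.x - 576.123) < 1e-6)      # True
```

Because the offset stays small, it can be stored as a 32-bit float without the coarse spacing that the full world coordinate would suffer.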

simonask 9/12/2025||

Quantization is also precision loss.