Posted by mittermayr 4 hours ago

Ask HN: We just had an actual UUID v4 collision...

I know what you're thinking... and I still can't believe it, but...

This morning, our database flagged a duplicate UUID (v4). I checked, thinking it may have been a double-insert bug or something, but no.

The original UUID was from a record added in 2025 (about a year ago), and today the system inserted a new document with a fresh UUIDv4 and it came up with the exact same one:

b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd

We're using this: https://www.npmjs.com/package/uuid

I thought this is technically impossible, and it will never happen, and since we're not modifying the UUIDs in any way, I really wonder how that.... is possible!? We're literally only calling:

import { v4 as uuidv4 } from "uuid";

const document_id = uuidv4();

... and then insert into the database, that's it.

Additionally, the database only has about 15,000 records, and now one collision. Statistically... impossible.

Has that ever happened to anyone?! What in the...

44 points | 54 comments
mittermayr 3 hours ago|
I fully agree. It makes no sense. Yet...

My only guess is that we originally generated UUIDv4s on a user's phone before sending them to the database, and the UUID generated this morning that collided was created on an Ubuntu server.

I don't fully know how UUIDv4s are generated and what (if anything) about the machine generating them feeds into the algorithm, but that's really the only change I can think of: the ID used to be generated on-device by users and, for many months now, has been generated on the server.

AntiUSAbah 2 hours ago||
You let users generate a UUID?

To be honest, the chance that you are doing something weird is probably higher than you experiencing a real UUID conflict.

How did your database 'flag' that conflict?

mittermayr 2 hours ago||
user-generated (as in: on the user's phone) was only at the very early stages of this product, and we've since moved to on-server. It's a cash-register type of app, where the same invoice must not be stored twice. So we used to generate a fresh invoice_id (uuidv4) on the user's device for each new invoice, and a double-send of that would automatically be flagged server-side (same id twice). This has since moved on to a server-only mechanism.

The database flagged it simply by having a UNIQUE key on the invoice_id column. First entry was from 2025, second entry from today.

stubish 3 hours ago|||
A genuine UUIDv4 collision is statistically extremely unlikely. What is more likely is that both systems used the same seed. The seed might be just a handful of bytes, which raises the chance of a collision to one in billions or even one in millions.
lazyjones 1 hour ago||
Better check what crypto.js is actually doing in your exact setup. Weak polyfills exist...
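
To illustrate the seed concern raised above: if a UUID generator draws from a small seeded PRNG instead of a cryptographic source, two machines that start from the same seed emit identical IDs. This is only a sketch under stated assumptions. `smallPrng` and `weakUuidV4` are made-up names, and nothing here is claimed to be what the `uuid` package or any particular polyfill actually does.

```javascript
// Hypothetical weak generator: a 32-bit seeded PRNG (mulberry32-style).
// Only 2^32 seeds exist, so only 2^32 distinct output streams exist.
function smallPrng(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Hypothetical v4-shaped UUID built from whatever `rand` returns.
function weakUuidV4(rand) {
  const bytes = Array.from({ length: 16 }, () => Math.floor(rand() * 256));
  bytes[6] = (bytes[6] & 0x0f) | 0x40; // version 4
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // RFC variant (binary 10xx)
  const hex = bytes.map((b) => b.toString(16).padStart(2, "0")).join("");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

// Two "machines" that happen to start from the same seed.
const onPhone = weakUuidV4(smallPrng(12345));
const onServer = weakUuidV4(smallPrng(12345));
console.log(onPhone === onServer); // true: same seed, same "random" UUID
```

With a 32-bit seed the effective ID space is at most 2^32 values rather than 2^122, which is the scale of risk stubish describes.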
serf 4 hours ago||
1 in 4.73 × 10²⁸

1 in 47.3 octillion.

I'd suspect a race condition or some other naive mistake; otherwise I'd be stocking up on lottery tickets.

(lol at the other user posting at the same time about the lottery ticket.. great minds and all that.)
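
For reference, a figure of this size is consistent with the usual birthday-bound estimate for about 15,000 records (the count given in the post), assuming all 122 random bits of a v4 UUID are uniformly random. That is an assumption about how the number was derived, not something stated in the comment. `oneInOdds` is a hypothetical helper name.

```javascript
// A v4 UUID has 128 bits, minus 4 version bits and 2 variant bits.
const RANDOM_BITS = 122n;
const SPACE = 2n ** RANDOM_BITS;

// Birthday-bound approximation: P(any collision) ~ pairs / SPACE,
// where pairs = n * (n - 1) / 2. Returns X such that odds are "1 in X".
function oneInOdds(records) {
  const n = BigInt(records);
  const pairs = (n * (n - 1n)) / 2n;
  return SPACE / pairs; // BigInt division truncates, fine at this scale
}

const odds = oneInOdds(15000);
// Roughly 4.7 x 10^28 for 15,000 records.
console.log("1 in " + Number(odds).toExponential(2));
```

The approximation only holds while the probability is tiny, which is the case here by a very wide margin.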

petee 47 minutes ago|
I've always looked at it the other way: being that lucky would mean you have even less chance of something else lucky happening, so it's a good time to save your money.
sublinear 1 hour ago||
> We're using this: https://www.npmjs.com/package/uuid

Why? There's a built-in for this.

https://nodejs.org/api/crypto.html#cryptorandomuuidoptions
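
A minimal sketch of the built-in mentioned above, assuming a reasonably current Node.js release. The Node.js documentation describes `crypto.randomUUID()` as generating a random RFC 4122 version 4 UUID from a cryptographic pseudorandom number generator. The variable name `documentId` is just for illustration.

```javascript
// CommonJS form; in an ES module the equivalent is:
//   import { randomUUID } from "node:crypto";
const { randomUUID } = require("node:crypto");

// No third-party dependency and no bundler polyfill involved.
const documentId = randomUUID();
console.log(documentId);
```

This does not explain the collision by itself, but it removes one variable: which random source a third-party package or polyfill ends up using in a given environment.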

wg0 3 hours ago||
Would UUID v7 be more collision-proof? Hard to say: it takes time into account, but then the number of entropy bits is reduced, so UUIDs generated at exactly the same time have a higher chance of a collision because the random bits cover a much smaller space.

Thoughts?

AntiUSAbah 2 hours ago|
You open up a new block every millisecond, so a collision should be even less likely.
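
As a rough sketch of the layout under discussion: per RFC 9562, a v7 UUID carries a 48-bit Unix millisecond timestamp, which leaves 74 random bits (12 + 62) after the version and variant fields. Two v7 IDs can therefore only collide if they share the same millisecond and the same 74 random bits. `uuidv7Sketch` is a hypothetical helper, not the API of the `uuid` package, and it omits the optional monotonic-counter methods the RFC describes.

```javascript
const { randomBytes } = require("node:crypto");

// Hypothetical helper: builds a v7-shaped UUID for a given millisecond.
function uuidv7Sketch(nowMs = Date.now()) {
  const bytes = randomBytes(16); // start fully random, then overwrite fields

  // Bytes 0..5: 48-bit big-endian Unix timestamp in milliseconds.
  let ts = BigInt(nowMs);
  for (let i = 5; i >= 0; i--) {
    bytes[i] = Number(ts & 0xffn);
    ts >>= 8n;
  }

  bytes[6] = (bytes[6] & 0x0f) | 0x70; // version 7, keeps 4 random bits
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // RFC variant, keeps 6 random bits

  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

const a = uuidv7Sketch();
const b = uuidv7Sketch();
console.log(a, b); // same or adjacent timestamp prefix, different random tail
```

So both comments have a point: fewer random bits per ID, but the birthday problem restarts every millisecond instead of accumulating over the lifetime of the table.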
ares623 1 hour ago||
Buy a lottery ticket
beardyw 3 hours ago||
Just a stupid question, but why not append the date, even just in seconds as hex? It's only a few bytes and would guarantee that everything that is OK now will be OK in the future.
flohofwoe 3 hours ago||
You can just use a different UUID version which includes timestamp data instead (e.g. v1 or v7); there are also versions which include the MAC address.
pan69 2 hours ago|||
> but why not append the date

And use uuid v5 to hash it :)

mittermayr 3 hours ago||
yeah, any sort of additional semi-random data could've helped prevent this, I'm sure. That, however, is also kind of the idea of UUIDv4, it has lots of randomness and time built in already.
flohofwoe 3 hours ago||
UUID v4 consists of only random bits, no timestamp info.
mittermayr 2 hours ago||
oh, interesting, I didn't know that. This could be part of the problem, depending on what's used as the seed.
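
To make the point about v4 concrete: the only fixed parts of a v4 UUID are the 4-bit version field and the 2-bit variant field, and the remaining 122 bits are random with no timestamp. The sketch below reads those two fields from the colliding ID quoted in the post. `describeUuid` is a hypothetical helper name.

```javascript
// Hypothetical helper: extracts the version and variant fields of a UUID.
function describeUuid(uuid) {
  const hex = uuid.replace(/-/g, "").toLowerCase();
  if (!/^[0-9a-f]{32}$/.test(hex)) {
    throw new Error("not a UUID: " + uuid);
  }
  // Hex digit 12 (first digit of the third group) is the version.
  const version = parseInt(hex[12], 16);
  // The top two bits of hex digit 16 are binary 10 for RFC-layout UUIDs.
  const variantNibble = parseInt(hex[16], 16);
  const isRfcVariant = (variantNibble & 0b1100) === 0b1000;
  return { version, isRfcVariant };
}

const info = describeUuid("b6133fd6-70fe-4fe3-bed6-8ca8fc9386cd");
console.log(info); // { version: 4, isRfcVariant: true }
```

The ID from the post is a well-formed v4 value, so nothing about its shape hints at which machine or moment produced it.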
AndreyK1984 1 hour ago||
Why not use a timestamp-based UUID instead?
dgellow 1 hour ago|
How confident are you that your machines' clocks are in perfect sync? What about the risk of clock drift + correction, or hardware issues?
croon 56 seconds ago||
Not GP, but: not confident. How confident would I be that a UUID collision would not happen at the same moment as a clock desync landing on the exact same timestamp? Very, which is how confident I was about not encountering a UUID collision before this thread, so very++ I guess.
naikrovek 3 hours ago||
The chance of a UUIDv4 collision is very low, but it is never zero.

If everything is done properly, then this is very likely the one and only time anyone involved in the telling or reading of this account will ever experience this.

dalmo3 2 hours ago|
Classic gambler's fallacy!