Posted by todsacerdoti 10/27/2024

Using less memory to look up IP addresses in Mess With DNS (jvns.ca)
137 points | 41 comments (page 2)
gmuslera 10/27/2024|
The trie approach seemed a bit plain. Tries are particularly good for storing netblocks and checking whether a particular IP belongs to one of the stored blocks. But it is better to use binary values instead of strings, and to use Patricia or radix trees, since they can compress runs of bits that have no branches into a single node. And anyway, PostgreSQL already has a netblock type with efficient indexes for it.

But if it is used for individual IPs, without regard to the blocks they belong to, it probably won't bring big gains in that area.
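
To make the "binary values instead of strings" point concrete, here is a minimal sketch of an uncompressed binary trie keyed on the bits of an IPv4 prefix, with longest-prefix lookup. The class name `BitTrie` and its methods are made up for illustration; this is not the code from the article. A Patricia or radix tree would go one step further and collapse each chain of single-child nodes into one node.

```python
import ipaddress


class BitTrie:
    """Uncompressed binary trie over IPv4 prefix bits (illustrative only)."""

    def __init__(self):
        # Each node is a list: [child_for_bit_0, child_for_bit_1, value]
        self.root = [None, None, None]

    def insert(self, cidr, value):
        """Store `value` under a netblock such as '10.0.0.0/8'."""
        net = ipaddress.ip_network(cidr)
        bits = int(net.network_address)
        node = self.root
        # Walk only the first `prefixlen` bits, most significant bit first.
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            if node[bit] is None:
                node[bit] = [None, None, None]
            node = node[bit]
        node[2] = value

    def lookup(self, ip):
        """Return the value of the longest stored prefix containing `ip`."""
        bits = int(ipaddress.ip_address(ip))
        node = self.root
        best = node[2]
        for i in range(32):
            node = node[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node[2] is not None:
                best = node[2]  # a longer matching prefix wins
        return best
```

For example, after inserting `10.0.0.0/8` and `10.1.0.0/16`, a lookup of `10.1.2.3` returns the value stored for the /16, while `10.2.0.1` falls back to the /8. With only individual addresses stored, every entry needs a full 32-node path, which is the overhead the comment above is pointing at.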

ignoramous 10/27/2024|
For hierarchical data like IP addresses, I've seen programs use specialized data structures like CritBit Tries [0] and Allotment Routing Tables [1].

[0] https://news.ycombinator.com/item?id=3015246

[1] https://github.com/openbsd/src/blob/2bd42e97200bee/sys/net/a...

Timber-6539 10/27/2024||
Had the same issue with restic leaving a lock in place after an unsuccessful operation. I have a very simple workaround for it.

ExecStartPre=/usr/bin/restic unlock

ExecStart=/usr/bin/restic backup

LtdJorge 10/27/2024|
That has the benefit of Systemd not letting more than one instance run at the same time, which I'm guessing is one of the reasons the lock was needed in the first place.

You just have to make sure that the type of Service used is correct, so that Systemd can track whether Restic has actually stopped running.

wvh 10/28/2024||
SQL is not going to be of much help when the dataset easily fits into memory, but I'm a bit surprised by the lack of performance of the trie. I guess the overhead is too much when the dataset has a lot of single IP addresses (only one byte difference), but I imagine it would be helpful for sparse IPv6 addresses.

Go has an excellent standard library, but the solutions in there rarely compete with a dedicated library written by people who really had to solve a hard problem.
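
As a point of comparison for the "fits into memory" case, one common alternative to a trie is a sorted array of range starts searched with binary search. The sketch below is an assumption for illustration, not something taken from this thread: `RangeTable` is a made-up name, it handles IPv4 only, and it assumes the ranges do not overlap.

```python
import bisect
import ipaddress


class RangeTable:
    """Sorted, non-overlapping IPv4 ranges looked up by binary search."""

    def __init__(self, ranges):
        # ranges: iterable of (start_ip, end_ip, value) with inclusive bounds
        rows = sorted(
            (
                (int(ipaddress.ip_address(start)), int(ipaddress.ip_address(end)), value)
                for start, end, value in ranges
            ),
            key=lambda row: row[0],
        )
        self.starts = [row[0] for row in rows]
        self.ends = [row[1] for row in rows]
        self.values = [row[2] for row in rows]

    def lookup(self, ip):
        """Return the value of the range containing `ip`, or None."""
        n = int(ipaddress.ip_address(ip))
        # Index of the last range whose start is <= n.
        i = bisect.bisect_right(self.starts, n) - 1
        if i >= 0 and n <= self.ends[i]:
            return self.values[i]
        return None
```

A single address is just a range whose start and end are equal, so it costs one row rather than a 32-node path.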

1a527dd5 10/29/2024||
I wonder if DuckDB could do better than SQLite?
reincoder 11/2/2024|
Tried it. I think it is marginally better than SQLite, but it is not the best solution.

I work for IPinfo and was quite excited when DuckDB added support for the IP address data type. However, at the end of the day, the most efficient lookup mechanism has to be the MMDB database. Rather than doing the enrichment of IP addresses inside DuckDB, the better solution was to do the enrichment outside DuckDB using the MMDB database, dump the results to a CSV, and load that CSV into DuckDB as a table.

See the top comment of the thread for more context around MMDB.
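
The "enrich outside, then load as a table" workflow described above might look roughly like this. It is a sketch under assumptions: `enrich_to_csv`, the column names, and the sample records are all hypothetical, and `lookup` stands in for whatever MMDB reader call you actually use.

```python
import csv
import io


def enrich_to_csv(ips, lookup, out):
    """Write one CSV row per IP.

    `lookup` is any callable that maps an IP string to a dict (or None);
    here it is a stand-in for an MMDB reader. The field names `country`
    and `asn` are assumptions and depend on the database you load.
    """
    writer = csv.writer(out)
    writer.writerow(["ip", "country", "asn"])
    for ip in ips:
        record = lookup(ip) or {}
        writer.writerow([ip, record.get("country", ""), record.get("asn", "")])


# Made-up sample data using documentation-reserved addresses and ASN.
FAKE_DB = {"198.51.100.7": {"country": "ZZ", "asn": "AS64500"}}

buf = io.StringIO()
enrich_to_csv(["198.51.100.7", "192.0.2.1"], FAKE_DB.get, buf)
```

In practice `out` would be a file opened for writing, and the resulting CSV could then be loaded into the database as an ordinary table and queried like any other.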

1a527dd5 11/4/2024||
Yeah, I gave it a shot. It didn't have much over SQLite, other than a much smaller file size.

Also, I love IPinfo, excellent service! We use it all the time at work.

reincoder 11/5/2024||
Awesome. We’ve integrated with platforms like GCP BigQuery, Snowflake, ClickHouse, Postgres, etc., and still, MMDB stands out as the most performant, especially for enriching millions of IPv6 addresses.

The best approach might be to enrich IP addresses via an MMDB database, then dump the results into a temp table and query them as you would any table. If you haven't tried the free database, you should, as it illustrates this point pretty well: https://ipinfo.io/products/free-ip-database

It’s great to hear that you’re using our service! Please feel free to reach out anytime if you have any queries/feedback. This thread alone contains comments from 2 IPinfo engineers and me!
