What are skiplists good for?

Posted by mfiguiere 2 days ago

What are skiplists good for?(antithesis.com)

261 points | 63 comments

cremer 15 hours ago|

Redis sorted sets are probably the most widely deployed example. Redis uses a skiplist for range queries and ordered iteration paired with a hash table for O(1) lookups. Together they cover the full API at the right complexity for each operation

Skiplists also win over balanced BSTs when it comes to concurrent access. Lock-free implementations are much simplier to reason about and get right. ConcurrentSkipListMap has been in the standard library since Java 6 for exactly this reason and it holds up well under high contention

antirez 12 hours ago||

Yep, and it was simple in Redis to augment them with the "span" in order to support ranking, that is, given al element, tell me its position in the ordered collection.

nulltrace 1 hour ago|||

Rebalancing is what really kills you. A CAS loop on a flat list is pretty straightforward, you get it working and move on. But rotations? You've got threads mid-insert on nodes you're about to move around. It gets ugly fast. Skiplists just sidestep the whole thing since level assignment is basically a coin flip, nothing you need to keep consistent. Cache locality is worse, sure, but honestly on write-heavy paths I've never seen that be the actual bottleneck.

maerF0x0 6 hours ago|||

A source: https://stackoverflow.com/a/46841708

ignoramous 8 hours ago||

> Skiplists also win over balanced BSTs when it comes to concurrent access.

Balanced Skiplists search better than plain Skiplists which may skew (but balancing itself is expensive). Also, I've have found that finger search (especially with doubly-linked skiplist with million+ entries) instead of always looking for elements from the root/head is an even bigger win.

> ConcurrentSkipListMap has been in the standard library since Java 6 for exactly this reason and it holds up well under high contention

An amusing observation I lifted from OCaml's implementation (which inturn quotes Donald Knuth): MSBs of PRNG values have more "randomness" than LSBs (randomness affects "balance"): https://github.com/ocaml/ocaml/blob/389121d3/runtime/lf_skip...

---

Some neat refs from our codebase:

- Skip lists: Done right (2016), https://ticki.github.io/blog/skip-lists-done-right / https://archive.is/kwhnG

- An analysis of skip lists, https://eugene-eeo.github.io/blog/skip-lists.html / https://archive.is/ffCDr

- Skip lists, http://web.archive.org/web/20070212103148/http://eternallyco... / https://archive.is/nl3G8

carlsverre 11 hours ago||

(I used to work at SingleStore, and now work at Antithesis)

SingleStore (f.k.a. MemSQL) used lock-free skiplists extensively as the backing storage of their rowstore tables and indexes. Adam Prout (ex CTO) wrote about it here: https://www.singlestore.com/blog/what-is-skiplist-why-skipli...

When SingleStore added a Columnar storage option (LSM tree), L0 was simply a rowstore table. Since rowstore was already a highly optimized, durable, and large-scale storage engine, it allowed L0 to absorb a highly concurrent transactional write workload. This capability was a key part of SingleStore's ability to handle HTAP workloads. If you want to learn more, take a look at this paper which documents the entire system end-to-end: https://dl.acm.org/doi/10.1145/3514221.3526055

reitzensteinm 7 hours ago|

At the intersection of these two topics, does Antithesis have any capabilities around simulating memory ordering to validate lock free algorithms?

carlsverre 4 hours ago||

We support thread-pausing via instrumentation. This can cause threads to observe different interleavings, which can help uncover bugs in concurrent algorithms. At this time, we don't perform specific memory model fault injection or fuzzing.

ozgrakkurt 13 hours ago||

Some more links that are inside the article:

- More info about skiplists: https://arxiv.org/pdf/2403.04582

- Performance comparison with B-tree ?: https://db.cs.cmu.edu/papers/2018/mod342-wangA.pdf

- Other blog from Anthithesis about writing their own db: https://antithesis.com/blog/2025/testing_pangolin/

Also I find it a bit hard to understand the performance outcome of this setup.

I know formats like parquet and databases like ClickHouse work better when duplicating data instead of doing joins. I guess BigQuery is similar.

The article is great but would be also interesting to learn how performance actually worked out with this.

nz 8 hours ago|

Back in 2014, I did an analysis of (single threaded) CPU-efficiency and RAM-efficiency of various data-structures (skiplists, slablists, avl-trees, rb-trees, b-trees):

https://nickziv.wordpress.com/wp-content/uploads/2014/02/vis...

I used whatever I could find on the internet at the time, so the comparison compares both algorithm and implementation (they were all written in C, but even slight changes to the C code can change performance -- uuavl performs much better than all other avl variants, for example). I suspect that a differently-programmed skip-list would not have performed quite so poorly.

The general conclusion from all this, is that any data-structure that can organize itself _around_ page-sizes and cache-sizes, will perform very well compared to structures that cannot.

josephg 17 hours ago||

For this problem, I’d consider a different approach. You have a fuzzer, and based on some seed it’s generating lots of records. You then need to query a specific record (or set of records) based on the leaf.

I’d just store a table of records with the leaf, associated with the seed. A good fuzzer is entirely deterministic. So you should be able to regenerate the entire run from simply knowing the seed. Just store a table of {leaf, seed}. Then gather all the seeds which generated the leaf you’re interested in and rerun the fuzzer for those seeds at query time to figure out what choices were made.

_dain_ 13 hours ago|

(I work at Antithesis)

Yes, this is (more or less) how we regenerate the system state, when necessary. But keep in mind that the fuzzing target is a network of containers, plus a whole Linux userland, plus the kernel. And these workloads often run for many minutes in each timeline. Regenerating the entire state from t=0 would be far too computationally intensive on the "read path", when all you want are the logs leading up to some event. We only do it on the "write path", when there's a need to interact with the system by creating new branching timelines. And even then, we have some smart snapshotting so that you're not always paying the full time cost from t=0; we trade off more memory usage for lower latency.

Oh one other thing: the "fuzzer" component itself is not fully deterministic. It can't be, because it also has to forward arbitrary user input into the simulation component (which is deterministic). If you decide to rewind to some moment and run a shell command, that's an input which can't be recovered from a fixed random seed. So in practice we explicitly store all the inputs that were fed in.

bob1029 17 hours ago||

On practical machines they aren't good for much. To access a value in a skip list you have to dereference way more pointers than in a b+ tree. On paper they're about the same, but in practice the binary tree will tend to outperform. You get way more work done per IO operation.

marginalia_nu 15 hours ago|

Skiplists are designed for fast intersection, not for single value lookup (assuming a sane design that's not based on linked lists, that's just an educational device that's never used in practice).

They are extremely good at intersections, as you can use the skip pointers in clever ways to skip ahead and eliminate whole swathes of values. You can kinda do that with b-trees[1] as well, but skip lists can beat them out in many cases.

It's highly dependent on the shape of the data though. For random numbers, it's probably an even match, but for postings lists and the like (where skiplists are often used), they perform extremely well as these often have runs of semiconsecutive values being intersected.

[1] I'll argue that if you squint a bit, a skiplist arguably is a sort of degenerate b-tree.

senderista 1 hour ago|||

Treaps can handle parallel set operations very efficiently:

https://www.cs.cmu.edu/~scandal/papers/treaps-spaa98.pdf

torginus 11 hours ago|||

I don't understand how they differ in this regard from range trees, which they essentially are, just their method of construction is different.

Things like BSP trees are very good at intersections indeed, and have been used for things since time immemorial, but I think the skiplist/tree tradeoff is not that different in this domain.

ahartmetz 20 hours ago||

Skiplists have some nice properties - the code is fairly short and easy to understand, for one. Qt's QMap used to be skip list based, here's the rationale given for it: https://doc.qt.io/archives/qq/qq19-containers.html#associati...

dwdz 15 hours ago|

It seems like Qt went from red-black tree to skip list in Qt4 and back to red-black tree in Qt5.

torginus 11 hours ago||

yeah it turns out that complex code, when its properly encapsulated and implemented in a bug-free manner, is not such a cost after all.

A correct skiplist is easier to NIH than a correct red-black tree (which for me was the final boss of the DS class in college), but has performance edge cases a red-black tree doesnt, if you treat it like a search tree.

ahartmetz 7 hours ago||

I think it was more about binary size. There are a few sentences in the Qt containers documentation about them being "optimized to minimize code expansion".

torginus 4 hours ago||

I mean that could mean a lot of things. By default, in C++, an std::vector<float> and std::vector<int> are two entirey separate classes with distinct methods getting compiled, even though in this particular the methods would be identical. Moreover, since templates are in headers, every object file gets their own compiled copy.

I'm sure there's some cleverness in there to mitigate the problem somewhat, but the problem still fundamentally exists

You can mitigate this manually to some degree, by making the generic classes call out to non-generic functions, of which there's guaranteed to be just 1 copy. I'm guessing that's what Qt does (among other things) when mentioning this

outpost_mystic2 31 minutes ago|||

Sorry if this is a dumb question but wouldn't vector<float> and vector<int> generate different code on get/set? Since floats go in different registers/need different instructions to load/store them to memory? Or is it something like, actually all of the std::vector functions technically operate on T&, and manipulating the pointers can use the same code, and really it's the optimizer that has to think about float vs int registers or something

ahartmetz 3 hours ago|||

Well, you can do two or three main things:

- Use an algorithm and implementation that needs a small amount of code for each instance (that is where the skip list is useful)

- Have a shared "out of line" (not inline) implementation for bulky parts of the implementation, where possible

- Support the compiler + linker feature to merge identical functions by making code identical between template instantiations of similar enough types

talideon 4 hours ago||

Plenty, but these days mostly if you either (a) want a simple implementation or (b) don't have to worry much about cache locality. The problem is that (b) doesn't really exist outside of retrocomputing and embedded systems these days. It's still one of my favourite data structures, just because it's a clever way to get most of the benefits of more complicated datastructures on small systems with minimal code.

repsilat 3 hours ago|

Occasionally in pure python "asymptotically reasonable, simple implementation, terrible memory locality" is the right pick for a data structure. You want to avoid an extra linear term, you don't care too much about the constant factor, and the cache doesn't really matter because the language is so slow that it's not really noticeable.

Not too common, but it happens.

winwang 19 hours ago||

Only somewhat related but there is supposedly a SIMD/GPU-friendly skiplist algo written about here: https://csaws.cs.technion.ac.il/~erez/Papers/GPUSkiplist.pdf

maxtaco 6 hours ago||

Backpointers to earlier epochs in append-only cryptographic data structures like key transparency logs. If the client last fetched epoch 1000, and the server reports the current epoch is 3000, the server can return log(2000) intermediate epochs, following skip pointers, to provide a hash chain from epoch 3000 to 1000.

https://github.com/foks-proj/go-foks/blob/main/lib/merkle/bi... https://github.com/foks-proj/go-foks/blob/main/lib/merkle/cl...

torginus 11 hours ago|

>What are skiplists good for

In practice, I have found out, nothing much. Their appeal comes from being simpler to implement than self-balancing trees, while claiming to offer the same performance.

But they completely lack a mechanism for rebalancing, and are incredibly pointer heavy (in this implementation at least), and inserts/deletes can involve an ungodly amount of pointer patching.

While I think there are some append-heavy access patterns where it can come up on top, I have found that the gap between using a BST, a hashtable, or just putting stuff in an array and just sorting it when needed is very small.

senderista 2 hours ago||

Lock-free BSTs or b-trees exist only in research papers, but lock-free skiplists are straightforward to implement.

CyberDildonics 7 hours ago||

You're right, but it's not popular to go against the premise of a post.

More comments...