Posted by lifty 6/28/2025
https://www.dolthub.com/blog/2025-06-26-prolly-tree-balance/
If it was not probabilistic then the balancing would be guaranteed in all cases. This typically means that it somehow stores balancing information somewhere so that it can detect when something is unbalanced and repair it. In this data structure we’re just hashing the content without really caring about the current balance and then it turns out that for most inputs it will be fine.
In principle, if your data were specially crafted to exploit the specific hash function (and salt) you could get an aberrant case like 1 million entries in a single b-tree node or a million b-tree nodes with just one entry. But unless you intentionally exploit the hash function the chance of this is vanishingly small.