Rust intentionally provides the simplest possible growable string buffer, String, which is literally (under the hood; you can't legitimately poke at this) a Vec<u8> plus the promise that the bytes are valid UTF-8 text.
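To make the "Vec<u8> plus a promise" point concrete, here is a minimal sketch using only the standard library. The helper name `string_from_bytes` is made up for illustration, and the sizes printed depend on the target (three words on the usual 64-bit platforms, though the exact layout is not something the language guarantees).

```rust
use std::mem::size_of;

/// Hypothetical helper: build a String from raw bytes.
/// Returns None when the bytes are not valid UTF-8.
fn string_from_bytes(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}

fn main() {
    // String has the same footprint as Vec<u8>: pointer, length, capacity.
    println!("String:  {} bytes", size_of::<String>());
    println!("Vec<u8>: {} bytes", size_of::<Vec<u8>>());

    // Valid UTF-8 is accepted; the Vec's buffer is reused, not copied.
    let euro = string_from_bytes(vec![0xE2, 0x82, 0xAC]); // the bytes of "€"
    println!("{:?}", euro);

    // A lone continuation byte is not valid UTF-8, so this is rejected.
    let bad = string_from_bytes(vec![0x80]);
    println!("{:?}", bad);

    // Going the other way hands back the underlying Vec<u8>.
    let bytes: Vec<u8> = String::from("hi").into_bytes();
    println!("{:?}", bytes);
}
```

The UTF-8 check at construction is the whole "promise"; everything else is plain Vec<u8> behavior.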
But you might find your needs better served by one (or several) of:
Box<str> -- if you don't need spare capacity, don't store it => length == capacity
CompactString -- uses the entire 24 bytes for SSO, so up to 24 bytes of UTF-8 are stored inline; obviously this doesn't make sense if all or the vast majority of your strings are 25 bytes or longer
ColdString -- same idea but for 8 bytes, and also not storing capacity; this only makes sense over Box<str> if you have plenty of <= 8 byte strings
Atoms: Each string can be referenced with a single u32 or even u16, and they're inherently deduplicated.
Bump allocator: your strings are &str, allocation is super fast with limited fragmentation.
Single pointer strings (this has a name, I can't think of it right now): you store the length inside the allocation instead of in each reference, so your strings are a single pointer.
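A rough sketch of two items from the list, using only std: Box<str> dropping the capacity word, and a toy atom table. `Atom` and `Interner` are names invented here, not any particular crate's API, and for simplicity this version keeps two copies of each distinct string (one as the map key, one in the table), which a real interner would avoid.

```rust
use std::collections::HashMap;
use std::mem::size_of;

/// Hypothetical atom: just an index into the interner's table.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Atom(u32);

/// Toy interner: equal strings always map to the same Atom.
#[derive(Default)]
struct Interner {
    lookup: HashMap<Box<str>, Atom>,
    strings: Vec<Box<str>>,
}

impl Interner {
    /// Return the existing atom for `s`, or register a new one.
    fn intern(&mut self, s: &str) -> Atom {
        if let Some(&atom) = self.lookup.get(s) {
            return atom;
        }
        let atom = Atom(self.strings.len() as u32);
        let owned: Box<str> = s.into();
        self.strings.push(owned.clone());
        self.lookup.insert(owned, atom);
        atom
    }

    /// Look the text back up from its atom.
    fn resolve(&self, atom: Atom) -> &str {
        &self.strings[atom.0 as usize]
    }
}

fn main() {
    // Box<str> is pointer + length; String additionally stores capacity.
    println!("String:   {} bytes", size_of::<String>());
    println!("Box<str>: {} bytes", size_of::<Box<str>>());
    println!("Atom:     {} bytes", size_of::<Atom>());

    // into_boxed_str drops any excess capacity.
    let boxed: Box<str> = String::from("hello").into_boxed_str();
    println!("{boxed}");

    let mut interner = Interner::default();
    let a = interner.intern("hello");
    let b = interner.intern("world");
    let c = interner.intern("hello"); // deduplicated: same atom as `a`
    println!("{a:?} {b:?} {c:?} -> {}", interner.resolve(c));
}
```

Comparing two atoms is an integer compare, which is a large part of the appeal when the same strings show up over and over.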
Perhaps because this feels like a fairly Rust-specific gotcha, especially if you're coming from languages where there's often not much syntactic distinction made between "this is a pointer because I don't want to be copying it" and "this is a pointer because it's optional."
For instance, it's not until now that I actually understood what the sibling comment about the Enum type size discrepancy lint meant: "This lint obviously cannot take the distribution of variants in your running program into account. It is possible that the smaller variants make up less than 1% of all instances, in which case the overhead is negligible and the boxing is counter-productive. Always measure the change this lint suggests." I had always accidentally read this backwards, thinking it meant something more to the effect of "if most of the instances are actually small, then it's not a problem here, but be aware that some of them are much larger so some of your calls to things with this could end up passing much larger types."
Clippy is essentially a linter, and one of its checks catches enums whose variants differ significantly in size, with a suggestion to Box the larger variant.
Since this is just a linter, it has no knowledge of how frequently each variant is actually used. It also doesn't address the situation in the article at all.
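For anyone who hasn't run into that lint (`large_enum_variant` in Clippy), this is roughly the shape of code it flags and the boxing it suggests. The enum names and the 1024-byte payload are made up for illustration; exact sizes depend on the target.

```rust
use std::mem::size_of;

// Hypothetical enum: every value is as large as the biggest variant,
// even when it holds the one-byte variant.
#[allow(dead_code)]
enum Unboxed {
    Small(u8),
    Large([u8; 1024]),
}

// Same data, but the large payload lives behind a pointer on the heap.
#[allow(dead_code)]
enum Boxed {
    Small(u8),
    Large(Box<[u8; 1024]>),
}

fn main() {
    println!("Unboxed: {} bytes", size_of::<Unboxed>());
    println!("Boxed:   {} bytes", size_of::<Boxed>());

    // The trade-off the lint cannot see: every Boxed::Large costs an
    // allocation and a pointer chase, so if nearly all values are Large,
    // boxing may be a net loss.
    let value = Boxed::Large(Box::new([0u8; 1024]));
    if let Boxed::Large(buf) = &value {
        println!("payload length: {}", buf.len());
    }
}
```

Whether the boxed version wins depends on the mix of variants at runtime, which is exactly the "always measure" caveat in the lint's docs.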
It's especially problematic because, in most cases, traits don't have memory behavior like what this article describes: by default a trait is unsized, because it's a description of behavior, not data, and you can't even use one as a struct field without extra work.
Like, replace "trait" in here with "box" and see how confusing it would be to be describing how you saved memory by boxing your box, because Option doesn't box the way it does in many other languages.
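A small sketch of both points, with made-up types (`Shape`, `Square`, `Holder`, `Payload`): Option<T> stores T inline rather than behind a pointer, Option<Box<T>> costs nothing over Box<T>, and a trait object only fits in a struct field once it's behind something like a Box.

```rust
use std::mem::size_of;

// Hypothetical trait, only here to have something to make an object of.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

// `shape: dyn Shape` would not compile here because `dyn Shape` is unsized;
// the "extra work" is putting it behind a pointer.
struct Holder {
    shape: Box<dyn Shape>,
}

// Hypothetical large payload.
#[allow(dead_code)]
struct Payload([u8; 256]);

fn main() {
    // Option does not box: the payload is stored inline, plus a discriminant.
    println!("Option<Payload>:      {} bytes", size_of::<Option<Payload>>());

    // Box is never null, so None reuses the null pointer value.
    println!("Box<Payload>:         {} bytes", size_of::<Box<Payload>>());
    println!("Option<Box<Payload>>: {} bytes", size_of::<Option<Box<Payload>>>());

    // A boxed trait object is a fat pointer: data pointer plus vtable pointer.
    println!("Box<dyn Shape>:       {} bytes", size_of::<Box<dyn Shape>>());

    let holder = Holder { shape: Box::new(Square(2.0)) };
    println!("area: {}", holder.shape.area());
}
```

So "optional" and "behind a pointer" are two separate decisions in Rust, which is the part that's easy to miss coming from languages where a nullable reference does both at once.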
> a lot of boxes means a fragmented heap. In such case it's not a problem but this might be worth keeping in mind.
A good malloc will handle this without issue, thanks to various optimizations that specifically fight fragmentation. The default Linux malloc (glibc) may have issues, but I did say a good malloc (and I think even glibc generally shouldn't struggle with the pattern described).
Without that, if you try to suggest a transformation like this when the schema is first conceived, it will likely be considered premature optimization.