Posted by WoodenChair 9 hours ago

Python numbers every programmer should know (mkennedy.codes)
214 points | 97 comments
tgv 7 hours ago|
I doubt list and string concatenation operate in constant time, or else the cost would show up in another benchmark. E.g., you could concatenate two lists in the same time regardless of their size, but at the cost of slower access to the second one (or both).

More contentiously: don't fret too much over performance in Python. It's a slow language (except for some external libraries, but that's not the point of the OP).

jerf 7 hours ago|
String concatenation is mentioned twice on that page, with the same time given. The first time it has a parenthetical "(small)", the second time doesn't. I expect you were looking at the second one when you typed that, and I would agree you can't just label it constant time, but they do seem to have meant concatenating "small" strings, where the overhead of Python's object construction dominates the cost of building the combined string.
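
A quick way to sanity-check that, if you're curious (just a sketch, not the article's harness):

    import timeit

    small_a, small_b = "x" * 8, "y" * 8
    big_a, big_b = "x" * 1_000_000, "y" * 1_000_000

    # per-operation time in nanoseconds; the small case is dominated by
    # object overhead, the big case by copying the characters
    ns_small = timeit.timeit(lambda: small_a + small_b, number=100_000) / 100_000 * 1e9
    ns_big = timeit.timeit(lambda: big_a + big_b, number=1_000) / 1_000 * 1e9
    print(f"8-char concat:  {ns_small:,.0f} ns")
    print(f"1M-char concat: {ns_big:,.0f} ns")
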
mikeckennedy 6 hours ago||
Author here.

Thanks for the feedback, everyone. I appreciate @woodenchair posting it and @aurornis pointing out the intent of the article.

The idea of the article is NOT to suggest you should shave 0.5ns off by choosing some dramatically different algorithm or that you really need to optimize the heck out of everything.

In fact, I think a lot of what the numbers show is that overthinking the optimizations often isn't worth it (e.g. caching len(coll) into a variable rather than calling it over and over is less useful than it might seem conceptually).
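
For instance, here's roughly the kind of comparison I mean for the len() case (a simplified sketch, not the benchmark repo's exact code):

    import timeit

    data = list(range(10_000))

    def call_len_each_time():
        total = 0
        for _ in range(1_000):
            total += len(data)   # len() on a list is a cheap O(1) call in C
        return total

    def cache_len_once():
        n = len(data)
        total = 0
        for _ in range(1_000):
            total += n
        return total

    print(timeit.timeit(call_len_each_time, number=1_000))
    print(timeit.timeit(cache_len_once, number=1_000))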

Just write clean Python code. So much of it is way faster than you might have thought.

My goal was only to create a reference for what various operations cost, to help build a mental model.

willseth 4 hours ago|
Then you should have written that. Instead you have given more fodder for the premature optimization crowd.
jchmbrln 6 hours ago||
What would be the explanation for an int taking 28 bytes but a list of 1000 ints taking only 7.87KB?
wiml 5 hours ago|
That appears to be the size of the list itself, not including the objects it contains: 8 bytes per entry for the object pointer, and a kilo-to-kibi conversion. All Python values are "boxed", which is probably a more important thing for a Python programmer to know than most of these numbers.

The list of floats is larger, despite also being simply an array of 1000 8-byte pointers. I assume that it's because the int array is constructed from a range(), which has a __len__(), and therefore the list is allocated to exactly the required size; but the float array is constructed from a generator expression and is presumably dynamically grown as the generator runs and has a bit of free space at the end.
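
Easy enough to check with sys.getsizeof, which only counts the container itself, not the boxed elements (a sketch; exact numbers vary by CPython build):

    import sys

    ints_from_range = list(range(1000))                     # length known up front
    floats_from_gen = list(float(i) for i in range(1000))   # grown as the generator runs

    print(sys.getsizeof(ints_from_range))    # ~8 KB: list header + 1000 pointers
    print(sys.getsizeof(floats_from_gen))    # a bit bigger: over-allocation slack at the end

    # counting the boxed elements too gives the real footprint
    total = sys.getsizeof(ints_from_range) + sum(sys.getsizeof(i) for i in ints_from_range)
    print(total)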

lopuhin 4 hours ago|||
It's impressive how you figured out the reason for the difference in container size between the list of floats and the list of ints. Framed as an interview question, that would have been quite difficult, I think.
mikeckennedy 4 hours ago|||
It was. I updated the results to include the contained elements. I also updated the float list creation to match the int list creation.
oogali 7 hours ago||
It's important to know that these numbers will vary based on what you're measuring, your hardware architecture, and how your particular Python binary was built.

For example, my M4 Max running Python 3.14.2 from Homebrew (built, not poured) takes 19.73MB of RAM to launch the REPL (running `python3` at a prompt).

The same Python version launched on the same system with a single invocation of `time.sleep()`[1] takes 11.70MB.

My Intel Mac running Python 3.14.2 from Homebrew (poured) takes 37.22MB of RAM to launch the REPL and 9.48MB for `time.sleep`.

My number for "how much memory it's using" comes from running `ps auxw | grep python`, taking the value of the resident set size (RSS column), and dividing by 1,024.

1: python3 -c 'from time import sleep; sleep(100)'
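
(You can also pull roughly the same number from inside the process with the stdlib resource module — a sketch; note ru_maxrss is peak rather than current RSS, and is reported in bytes on macOS but kibibytes on Linux:)

    import resource
    import sys

    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is in bytes on macOS, kibibytes on Linux
    rss_mb = rss / (1024 * 1024) if sys.platform == "darwin" else rss / 1024
    print(f"peak RSS: {rss_mb:.2f} MB")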

lunixbochs 4 hours ago||
I'm confused why they repeatedly call a slots class larger than a regular dict class, but don't count the size of the dict
belabartok39 4 hours ago||
Hmmmm, there should absolutely be standard deviations for this type of work. Also, what is N, the number of runs? Does it say somewhere?
mikeckennedy 2 hours ago|
It is open source; you could just look. :) But here is a summary for you. It's not just a single run where I take the number:

Benchmark Iteration Process

Core Approach:

- Warmup Phase: 100 iterations to prepare the operation (default)

- Timing Runs: 5 repeated runs (default), each executing the operation a specified number of times

- Result: Median time per operation across the 5 runs

Iteration Counts by Operation Speed:

- Very fast ops (arithmetic): 100,000 iterations per run

- Fast ops (dict/list access): 10,000 iterations per run

- Medium ops (list membership): 1,000 iterations per run

- Slower ops (database, file I/O): 1,000-5,000 iterations per run

Quality Controls:

- Garbage collection is disabled during timing to prevent interference

- Warmup runs prevent cold-start bias

- Median of 5 runs reduces noise from outliers

- Results are captured to prevent compiler optimization elimination

Total Executions: For a typical benchmark with 1,000 iterations and 5 repeats, each operation runs 5,100 times (100 warmup + 5×1,000 timed) before reporting the median result.
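
In code, the loop is roughly this shape (a simplified sketch of the process described above, not the repo's exact implementation):

    import gc
    import statistics
    import time

    def bench(op, iterations=1_000, repeats=5, warmup=100):
        for _ in range(warmup):          # warmup phase to avoid cold-start bias
            op()

        per_op_times = []
        gc.disable()                     # keep the collector from interfering
        try:
            for _ in range(repeats):
                start = time.perf_counter()
                for _ in range(iterations):
                    result = op()        # capture the result so the work isn't skipped
                elapsed = time.perf_counter() - start
                per_op_times.append(elapsed / iterations)
        finally:
            gc.enable()

        return statistics.median(per_op_times)   # median across the repeated runs

    print(f"{bench(lambda: 3 * 7) * 1e9:.1f} ns per multiplication")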

belabartok39 2 hours ago||
That answers what N is (why not just say so in the article?). If you are only going to report medians, is there an appendix with further statistics, such as confidence intervals or standard deviations? For a serious benchmark, it would be essential to show the spread or variability, no?
mwkaufma 6 hours ago||
Why? If those microbenchmarks mattered in your domain, you wouldn't be using Python.
coldtea 6 hours ago||
That's an "all or nothing" fallacy. Just because you use Python and are OK with some slowdown, doesn't mean you're OK with each and every slowdown when you can do better.

To use a trivial example, using a set instead of a list to check membership is a very basic replacement, and can dramatically improve your running time in Python. Just because you use Python doesn't mean anything goes regarding performance.
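
A rough illustration (a sketch; exact numbers will vary):

    import timeit

    haystack_list = list(range(100_000))
    haystack_set = set(haystack_list)
    needle = 99_999   # worst case for the list: scanned from front to back

    t_list = timeit.timeit(lambda: needle in haystack_list, number=1_000) / 1_000
    t_set = timeit.timeit(lambda: needle in haystack_set, number=1_000) / 1_000
    print(f"list membership: {t_list * 1e6:.1f} µs per lookup")
    print(f"set membership:  {t_set * 1e6:.3f} µs per lookup")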

mwkaufma 5 hours ago||
That's an example of an algorithmic improvement (log n vs n), not a micro benchmark, Mr. Fallacy.
PhilipRoman 4 hours ago||
...and other hilarious jokes you can tell yourself!
dr_kretyn 7 hours ago||
Initially I was struck by how efficient strings are... but then I realized it's more that arithmetic is inefficient. Interesting comparison, but exact speed and I/O depend on a lot of things, and it's unlikely anyone runs a Mac mini in production, so these numbers definitely aren't representative.
Retr0id 5 hours ago||
> Numbers are surprisingly large in Python

Makes me wonder if the CPython devs have ever considered V8-like NaN-boxing or pointer stuffing.

woodruffw 7 hours ago|
Great reference overall, but some of these will diverge in practice: 141 bytes for a 100-char string won't hold for non-ASCII strings, for example, and will change if/when the object header overhead changes.
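
Easy to see with sys.getsizeof; CPython stores a string with 1, 2, or 4 bytes per code point depending on the widest character it contains (a sketch, exact sizes depend on the build):

    import sys

    for label, ch in [("ascii", "a"), ("latin-1", "é"),
                      ("2-byte BMP", "λ"), ("4-byte emoji", "🐍")]:
        s = ch * 100
        print(f"{label:15} {sys.getsizeof(s)} bytes")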