Posted by WoodenChair 5 hours ago

Python numbers every programmer should know (mkennedy.codes)
162 points | 76 comments
thundergolfer 1 hour ago|
A lot of people here are commenting that if you have to care about specific latency numbers in Python you should just use another language.

I disagree. A lot of important and large codebases were grown and maintained in Python (Instagram, Dropbox, OpenAI) and it's damn useful to know how to reason your way out of a Python performance problem when you inevitably hit one without dropping out into another language, which is going to be far more complex.

Python is a very useful tool, and knowing these numbers just makes you better at using the tool. The author is a Python Software Foundation Fellow. They're great at using the tool.

In the common case, a performance problem in Python is not the result of hitting the limit of the language but the result of sloppy un-performant code, for example unnecessarily calling a function O(10_000) times in a hot loop.
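
Something like this contrived sketch, to make it concrete (names are made up; the fix is just hoisting the invariant call out of the loop):

    import math

    # Stand-in for an expensive pure function.
    def scale(n):
        return math.fsum(1.0 / k for k in range(1, n + 1))

    # Sloppy: recomputes the same value once per element in the hot loop.
    def normalize_slow(values):
        return [v / scale(10_000) for v in values]

    # Better: compute it once and reuse the result.
    def normalize_fast(values):
        s = scale(10_000)
        return [v / s for v in values]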

I wrote up a more focused "Python latency numbers you should know" as a quiz here https://thundergolfer.com/computers-are-fast

i_am_a_peasant 3 minutes ago||
our build system is written in python, and i’d like it not to suck but still stay in python, so these numbers very much matter.
nutjob2 59 minutes ago|||
> A lot of important and large codebases were grown and maintained in Python

How does this happen? Is it just inertia that causes people to write large systems in an essentially type-free, interpreted scripting language?

hibikir 15 minutes ago|||
Small startups end up writing code in whatever gets things working faster, because having too large a codebase with too much load is a champagne problem.

If I told you that we were going to be running a very large payments system, with customers from startups to Amazon, you'd not write it in Ruby, put the data in MongoDB, and then use its oplog as a queue... but that's what Stripe looked like. They even hired a compiler team to add type checking to the language, as that made far more sense than porting a giant monorepo to something else.

xboxnolifes 53 minutes ago||||
It's very simple. Large systems start as small systems.
oivey 50 minutes ago||||
It’s a nice and productive language. Why is that incomprehensible?
oofbey 58 minutes ago|||
It’s very natural. Python is fantastic for going from 0 to 1 because it’s easy and forgiving. So lots of projects start with it. Especially anything ML focused. And it’s much harder to change tools once a project is underway.
passivegains 24 minutes ago||
this is absolutely true, but there's an additional nuance: yes, python is fantastic, yes, it's easy and forgiving, but there are other languages like that too. ...except there really aren't. other than ruby and maybe go, every other popular language sacrifices ease of use for things that simply do not matter for the overwhelming majority of programs. much of python's popularity doesn't come from being easy and forgiving, it's that everything else isn't. for normal programming why would we subject ourselves to anything but python unless we had no choice?

while I'm on the soapbox I'll give java a special mention: a couple years ago I'd have said java was easy even though it's tedious and annoying, but I've become reacquainted with it for a high school program (python wouldn't work for what they're doing and the school's comp sci class already uses java.)

this year we're switching to c++.

oofbey 59 minutes ago||
I think both points are fair. Python is slow - you should avoid it if speed is critical, but sometimes you can’t easily avoid it.

I think the list itself is super long winded and not very informative. A lot of operations take about the same amount of time. Does it matter that adding two ints is very slightly slower than adding two floats? (If you even believe this is true, which I don’t.) No. A better summary would say “all of these things take about the same amount of time: simple math, function calls, etc. these things are much slower: IO.” And in that form the summary is pretty obvious.
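
If you want to sanity-check that collapsed summary yourself, a rough timeit sketch (exact numbers vary by machine and Python version):

    import timeit

    # Simple operations all land in the same ballpark (tens of ns each).
    print(timeit.timeit("a + b", setup="a, b = 1, 2"))      # int add
    print(timeit.timeit("a + b", setup="a, b = 1.0, 2.0"))  # float add
    print(timeit.timeit("f(1)", setup="f = lambda x: x"))   # function call

    # Even a tiny I/O round trip is orders of magnitude slower per call.
    print(timeit.timeit("os.stat('.')", setup="import os", number=100_000))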

microtonal 36 minutes ago||
> I think the list itself is super long winded and not very informative.

I agree. I have to compliment the author for the effort put in. However, it misses the point of the original "Latency Numbers Every Programmer Should Know", which is to build an intuition for making good ballpark estimates of the latency of operations, e.g. that A is two orders of magnitude more expensive than B.

fooker 3 hours ago||
Counterintuitively: program in python only if you can get away without knowing these numbers.

When this starts to matter, python stops being the right tool for the job.

libraryofbabel 3 hours ago||
Or keep your Python scaffolding, but push the performance-critical bits down into a C or Rust extension, like numpy, pandas, PyTorch and the rest all do.

But I agree with the spirit of what you wrote - these numbers are interesting but aren't worth memorizing. Instead, instrument your code in production to see where it's slow in the real world with real user data (premature optimization is the root of all evil, etc.), and profile your code (py-spy is the best tool for this if you're looking for CPU-hogging code). If you find yourself worrying about how long it takes to add something to a list in Python, you really shouldn't be doing that operation in Python at all.
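
For anyone who hasn't tried py-spy: it attaches to a running process from the outside, no code changes needed. Typical invocations look something like this (the pid and script name are placeholders):

    # Attach to an already-running process and write a flamegraph
    py-spy record -o profile.svg --pid 12345

    # Or launch a script under the profiler
    py-spy record -o profile.svg -- python myscript.py

    # Live, top-like view of where CPU time is going
    py-spy top --pid 12345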

eichin 2 hours ago||
"if you're not measuring, you're not optimizing"
bathtub365 6 minutes ago|||
These basically seem like numbers of last resort. After you’ve profiled and ruled out all of the usual culprits (big disk reads, network latency, polynomial or exponential time algorithms, wasteful overbuilt data structures, etc) and need to optimize at the level of individual operations.
Demiurge 1 hour ago|||
I agree. I've been living off Python for 20 years and have never needed to know any of these numbers, nor do I need them now, for my work, contrary to the title. I also regularly use profiling for performance optimization and opt for Cython, SWIG, JIT libraries, or other tools as needed. None of these numbers would ever factor into my decision-making.
AtlasBarfed 7 minutes ago||
.....

You don't see any value in knowing these numbers?

Quothling 1 hour ago|||
Why? I've built some massive analytic data flows in Python with turbodbc + pandas which are basically C++ fast. It uses more memory, which supports your point, but on the flip side we're talking $5-10 of extra cost a year. It could frankly be $20k a year and still be cheaper than staffing more people like me to maintain these things, rather than having a couple of us and then letting the BI people use the tools we provide for them. Similarly, when we do embedded work, MicroPython is just so much easier for our engineering staff to deal with.

The interoperability between C and Python makes it great, and you need to know these numbers on Python to know when to actually build something in C. With Zig getting really great interoperability, things are looking better than ever.

Not that you're wrong as such. I wouldn't use Python to run an airplane, but I really don't see why you wouldn't care about the resources just because you're working with an interpreted or GC language.

its-summertime 23 minutes ago|||
From the complete opposite side, I've built some tiny bits of near irrelevant code where python has been unacceptable, e.g. in shell startup / in bash's PROMPT_COMMAND, etc. It ends up having a very painfully obvious startup time, even if the code is nearing the equivalent of Hello World

    time python -I -c 'print("Hello World")'
    real    0m0.014s
    time bash --noprofile -c 'echo "Hello World"'
    real    0m0.001s
fooker 1 hour ago|||
> you need to know these numbers on Python to know when to actually build something in C

People usually approach this the other way, use something like pandas or numpy from the beginning if it solves your problem. Do not write matrix multiplications or joins in python at all.
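
i.e. the difference between something like these two (illustrative sketch):

    import numpy as np

    n = 200
    a = [[float(i * j % 7) for j in range(n)] for i in range(n)]
    b = [[float(i + j) for j in range(n)] for i in range(n)]

    # Pure-Python triple loop: interpreter overhead on every multiply-add.
    c_slow = [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
              for i in range(n)]

    # The same work dispatched to optimized BLAS code via numpy.
    c_fast = np.array(a) @ np.array(b)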

If there is no library that solves your problem, it's a great indication that you should avoid python. Unless you are willing to spend 5 man-years writing a C or C++ library with good python interop.

oivey 44 minutes ago||
People generally aren’t rolling their own matmuls or joins or whatever in production code. There are tons of tools like Numba, Jax, Triton, etc that you can use to write very fast code for new, novel, and unsolved problems. The idea that “if you need fast code, don’t write Python” has been totally obsolete for over a decade.
fooker 25 minutes ago||
Yes, that's what I said.

If you are writing performance sensitive code that is not covered by a popular Python library, don't do it unless you are a megacorp that can put a team to write and maintain a library.

oivey 19 minutes ago||
It isn’t what you said. If you want, you can write your own matmul in Numba and it will be roughly as fast as similar C code. You shouldn’t, of course, for the same reason handrolling your own matmuls in C is stupid.

Many problems can be solved performantly in pure Python, especially via the growing set of tools like the JIT libraries I cited. Even more will be solvable when things like free-threaded Python land. It will be a minority of problems that can't be, if it isn't already.

MontyCarloHall 3 hours ago||
Exactly. If you're working on an application where these numbers matter, Python is far too high-level a language to actually be able to optimize them.
zelphirkalt 3 hours ago||
I doubt there is much to gain from knowing how much memory an empty string takes. The article, or the listed numbers, have a weird fixation on memory usage numbers and concrete time measurements. What is way more important to "every programmer" is time and space complexity, in order to avoid designing unnecessarily slow or memory-hungry programs. Under the assumption of using Python, what is the use of knowing that your int takes 28 bytes? In the end you will have to determine whether the program you wrote meets your performance criteria, and if it does not, then you need a smarter algorithm or way of dealing with data. It helps very little to know that your 2d array of 1000x1000 bools is so-and-so big. What helps is knowing whether it is too much, and that maybe you should switch to using a large integer and a bitboard approach. Or switch language.
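
To put numbers on that bitboard example, a quick sketch (sizes are 64-bit CPython ballparks):

    import sys

    # 1000x1000 grid as nested lists: ~8 MB of pointer storage alone
    # (True/False themselves are shared singletons).
    grid = [[False] * 1000 for _ in range(1000)]
    print(sys.getsizeof(grid) + sum(sys.getsizeof(row) for row in grid))

    # The same million cells packed into one big int as a bitboard:
    # CPython stores 30 bits per 4-byte digit, so roughly 130 KB.
    board = 1 << (1000 * 1000)
    print(sys.getsizeof(board))

    # Setting and testing cell (row, col):
    row, col = 123, 456
    board |= 1 << (row * 1000 + col)
    print(bool(board >> (row * 1000 + col) & 1))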
kingstnap 7 minutes ago||
I disagree. Performance is a leaky abstraction that *ALWAYS* matters.

Your cognition of it is either implicit or explicit.

Even if you didn't know, for example, that list appends are linear rather than quadratic, and fairly fast.

Even if you didn't give a shit if simple programs were for some reason 10000x slower than they needed to be because it meets some baseline level of good enough.

Library authors beneath you would know, and the APIs you interact with, the Pythonic code you see, and the code LLMs generate will all be affected by that leaky abstraction.

If you think n^2 naive list appends is a bad example, it's not, btw: Python string appends are n^2, and that has affected and still affects how people do things; f-strings, for example, are lazy.
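
The canonical illustration, for anyone who hasn't been bitten (CPython can sometimes optimize the += case in place, but it isn't guaranteed):

    chunks = ["x"] * 100_000

    # Worst case quadratic: each += may copy everything built so far.
    s = ""
    for c in chunks:
        s += c

    # Linear: gather the pieces and copy once at the end.
    s = "".join(chunks)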

Similarly, a direct consequence of dictionaries being fast in Python is that they are used literally everywhere. Raymond Hettinger's old PyCon 2017 talks cover this.

Qem 3 hours ago||
> Under the assumption of using Python, what is the use of knowing that your int takes 28 bytes?

Relevant if your problem demands instantiation of a large number of objects. This reminds me of a post where Eric Raymond discusses the problems he faced while trying to use Reposurgeon to migrate GCC. See http://esr.ibiblio.org/?p=8161

esafak 6 minutes ago||
The point of the original list was that the numbers were simple enough to memorize: https://gist.github.com/jboner/2841832

Nobody is going to remember any of the numbers on this new list.

perrygeo 21 minutes ago||
> small int (0-256) cached

It's -5 to 256, and these have very tricky behavior for programmers who confuse identity and equality.

  >>> a = -5
  >>> b = -5
  >>> a is b
  True
  >>> a = -6
  >>> b = -6
  >>> a is b
  False
Aurornis 2 hours ago||
A meta-note on the title since it looks like it’s confusing a lot of commenters: The title is a play on Jeff Dean’s famous “Latency Numbers Every Programmer Should Know” from 2012. It isn’t meant to be interpreted literally. There’s a common theme in CS papers and writing to write titles that play upon themes from past papers. Another common example is the “_____ considered harmful” titles.
shanemhansen 1 hour ago||
Going to write a real banger of a paper called "latency numbers considered harmful is all you need" and watch my academic cred go through the roof.
Kwpolska 2 hours ago|||
This title only works if the numbers are actually useful. Those are not, and there are far too many numbers for this to make sense.
Aurornis 1 hour ago||
The title was never meant to be taken literally, as in you're supposed to memorize all of these numbers. It was meant as an in-joke reference to the original writing, signaling that this document was going to contain timing values for different operations.

I completely understand why it's frustrating or confusing by itself, though.

willseth 1 hour ago||
Good callout on the paper reference, but this author gives every indication that he's dead serious in the first paragraph. I don't think commenters are confused.
f311a 1 hour ago||

> Strings
>
> The rule of thumb for strings is the core string object takes 41 bytes. Each additional character is 1 byte.
That's misleading. There are three types of strings in Python (1, 2 and 4 bytes per character).

https://rushter.com/blog/python-strings-and-memory/
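
Easy to see with sys.getsizeof (fixed overheads vary across CPython versions; the per-character width is the point, per PEP 393):

    import sys

    print(sys.getsizeof("a" * 10))           # 1 byte/char (latin-1 range)
    print(sys.getsizeof("\u0394" * 10))      # 2 bytes/char (BMP, e.g. Greek)
    print(sys.getsizeof("\U0001F600" * 10))  # 4 bytes/char (emoji, etc.)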

willseth 3 hours ago||
Every Python programmer should be thinking about far more important things than low level performance minutiae. Great reference but practically irrelevant except in rare cases where optimization is warranted. If your workload grows to the point where this stuff actually matters, great! Until then it’s a distraction.
HendrikHensen 1 hour ago||
Having general knowledge about the tools you're working with is not a distraction, it's an intellectual enrichment in any case, and can be a valuable asset in specific cases.
willseth 1 hour ago||
Knowing that an empty string is 41 bytes or how many ns it takes to do arithmetic operations is not general knowledge.
oivey 40 minutes ago||
How is it not general knowledge? How do you otherwise gauge if your program is taking a reasonable amount of time, and, if not, how do you figure out how to fix it?
kc0bfv 2 hours ago|||
I agree - however, that has mostly been a feeling for me for years. Things feel fast enough and fine.

This page is a nice reminder of the fact, with numbers. For a while, at least, I will Know, instead of just feel, that I can ignore the low-level performance minutiae.

amelius 3 hours ago||
Yeah, if you hit limits just look for a module that implements the thing in C (or write it). This is how it was always done in Python.
ryandrake 30 minutes ago|||
I am currently (as we type, actually, LOL) doing this exact thing in a hobby GIS project: Python got me a prototype and proof of concept, but now that I am scaling the data processing worldwide, it is obviously too slow, so I'm rewriting it (with LLM assistance) in C. The huge benefit of Python is that I have a known working (but slow) "reference implementation" to test against, so I know the C version works when it produces identical output. If I had had a known-good Python version of past C, C++, Rust, etc. projects I worked on, it would have been most beneficial when it came time to test and verify.
willseth 2 hours ago|||
Sometimes it’s as simple as finding the hotspot with a profiler and making a simple change to an algorithm or data structure, just like you would do in any language. The amount of handwringing people do about building systems with Python is silly.
boerseth 2 hours ago||
That's a long list of numbers that seem oddly specific. Apart from learning that f-strings are way faster than the alternatives, and certain other comparisons, I'm not sure what I would use this for day-to-day.

After skimming over all of them, it seems like most "simple" operations take on the order of 20ns. I will leave with that rule of thumb in mind.

0x000xca0xfe 2 hours ago|
That number isn't very useful either; it really depends on the hardware. Most virtualized server CPUs that e.g. Django will end up running on are nowhere near the author's M4 Pro.

Last time I benchmarked a VPS it was about the performance of an Ivy Bridge generation laptop.

giantrobot 1 hour ago||
> Last time I benchmarked a VPS it was about the performance of an Ivy Bridge generation laptop.

I have a number of Intel N95 systems around the house for various things. I've found them to be a pretty accurate analog for small VPS instances. The N95 uses Intel E-cores, which are effectively Sandy Bridge/Ivy Bridge-class cores.

Stuff can fly on my MacBook but then drag on a small VPS instance, so validating against an N95 (which I already have) is helpful. YMMV.

riazrizvi 3 hours ago|
The titles are oddly worded. For example -

  Collection Access and Iteration
  How fast can you get data out of Python’s built-in collections? Here is a dramatic example of how much faster the correct data structure is. item in set or item in dict is 200x faster than item in list for just 1,000 items!
It seems to suggest that iterating for x in mylist is 200x slower than for x in myset, but it's the membership test that is much slower, not the iteration. (Also, for x in mydict iterates over keys, not values, and so isn't what we think of as iterating over a dict's 'data'.)
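
That is, the 200x is about the membership test below, not the loop (rough sketch):

    import timeit

    setup = "xs = list(range(1000)); s = set(xs)"
    print(timeit.timeit("999 in xs", setup=setup, number=100_000))  # O(n) scan
    print(timeit.timeit("999 in s", setup=setup, number=100_000))   # O(1) hash lookup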

Also the overall title “Python Numbers Every Programmer Should Know” starts with 20 numbers that are merely interesting.

That all said, the formatting is nice and engaging.
