How to Benchmark (Python) Code
--
While preparing to write the “Writing Faster Python” series, the first problem I faced was:
“How do I benchmark a piece of code in an objective yet uncomplicated way?”
I could run `python -m timeit <piece of code>`, which is probably the simplest way of measuring how long it takes to execute some code¹. But maybe it's too simple, and I owe my readers a way of benchmarking that won't be skewed by sudden CPU spikes on my computer?
So here are a couple of different tools and techniques I tried. At the end of the article, I will tell you which one I chose and why. Plus, I will give you some rules of thumb for when each tool might be handy.
python -m timeit
The easiest way to measure how long it takes to run some code is to use the timeit module. You can write `python -m timeit your_code()`, and Python will print out how long it took to run whatever `your_code()` does. I like to put the code I want to benchmark inside a function for more clarity, but you don't have to do this. You can directly write multiple Python statements separated by semicolons, and that will work just fine. For example, to see how long it takes to sum up the first 1,000,000 numbers, we can run this code:
python -m timeit "sum(range(1_000_001))"
20 loops, best of 5: 11.5 msec per loop
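If you'd rather do the same from inside a Python script, the timeit module exposes the same machinery programmatically. Here is a minimal sketch; the 5×20 repeat/loop counts are my own arbitrary choice (the CLI picks the loop count automatically):

```python
import timeit

# Run 5 rounds of 20 loops each; repeat() returns the total time of each round.
times = timeit.repeat("sum(range(1_000_001))", repeat=5, number=20)

# Report the fastest round per loop, mimicking the CLI's "best of 5" output.
print(f"20 loops, best of 5: {min(times) / 20 * 1000:.1f} msec per loop")
```

Taking the minimum rather than the average is also what the CLI reports - the fastest run is the one least disturbed by other processes running on your machine.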
However, the `python -m timeit` approach has a major drawback - it doesn't separate the setup code from the code you want to benchmark. Let's say you have an import statement that takes a relatively long time compared to executing a function from that module. One such import is `import numpy`. If we benchmark those two lines of code:
```python
import numpy
numpy.arange(10)
```
the import will take most of the time during the benchmark. But you probably don’t want to benchmark how long it takes to import modules. You want to see how long it takes to execute some functions from that module.
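One way to see how much of the total the import accounts for is CPython's `-X importtime` flag (available since Python 3.7), which prints a per-module import timing breakdown to stderr. This is a side note of mine, not part of timeit itself:

```
$ python -X importtime -c "import numpy"
```

Each line of the output shows the self and cumulative time spent importing a module, so you can spot expensive imports before deciding what belongs in the setup code.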
python -m timeit -s "setup code"
To separate the setup code from the benchmarks, timeit supports the `-s` parameter. Whatever code you pass here will be executed but won't be part of the benchmark. So we can improve the above code and run it like this: python -m
…