Extended examples
BenchmarkBuilder
The simple_benchmark.BenchmarkBuilder class can be used to build a benchmark using decorators. It is essentially a wrapper around simple_benchmark.benchmark().
For example, to compare different approaches to calculating the sum of a list of floats:
from simple_benchmark import BenchmarkBuilder
import math

bench = BenchmarkBuilder()

@bench.add_function()
def sum_using_loop(lst):
    sum_ = 0
    for item in lst:
        sum_ += item
    return sum_

@bench.add_function()
def sum_using_range_loop(lst):
    sum_ = 0
    for idx in range(len(lst)):
        sum_ += lst[idx]
    return sum_

bench.use_random_lists_as_arguments(sizes=[2**i for i in range(2, 15)])
bench.add_functions([sum, math.fsum])

b = bench.run()
b.plot()

# To save the plotted benchmark as PNG file.
import matplotlib.pyplot as plt
plt.savefig('sum_list_example.png')
MultiArgument
The simple_benchmark.MultiArgument class can be used to provide multiple arguments to the functions that should be benchmarked:
from itertools import starmap
from operator import add
from random import random

from simple_benchmark import BenchmarkBuilder, MultiArgument

bench = BenchmarkBuilder()

@bench.add_function()
def list_addition_zip(list1, list2):
    res = []
    for item1, item2 in zip(list1, list2):
        res.append(item1 + item2)
    return res

@bench.add_function()
def list_addition_index(list1, list2):
    res = []
    for idx in range(len(list1)):
        res.append(list1[idx] + list2[idx])
    return res

@bench.add_function()
def list_addition_map_zip(list1, list2):
    return list(starmap(add, zip(list1, list2)))

@bench.add_arguments(name='list sizes')
def benchmark_arguments():
    for size_exponent in range(2, 15):
        size = 2**size_exponent
        arguments = MultiArgument([
            [random() for _ in range(size)],
            [random() for _ in range(size)]])
        yield size, arguments

b = bench.run()
b.plot()

# To save the plotted benchmark as PNG file.
import matplotlib.pyplot as plt
plt.savefig('list_add_example.png')
Asserting correctness
Besides comparing the timings, it's also important to assert that the approaches actually produce the same results and don't modify the input arguments.
To compare the results, use simple_benchmark.assert_same_results() (or, if you use BenchmarkBuilder, simple_benchmark.BenchmarkBuilder.assert_same_results()):
import operator
import random
from simple_benchmark import assert_same_results
funcs = [min, max] # will produce different results
arguments = {2**i: [random.random() for _ in range(2**i)] for i in range(2, 10)}
assert_same_results(funcs, arguments, equality_func=operator.eq)
To check that the inputs were not modified, use simple_benchmark.assert_not_mutating_input() (or, if you use BenchmarkBuilder, simple_benchmark.BenchmarkBuilder.assert_not_mutating_input()):
import operator
import random
from simple_benchmark import assert_not_mutating_input

def sort(l):
    l.sort()  # modifies the input
    return l

funcs = [sorted, sort]
arguments = {2**i: [random.random() for _ in range(2**i)] for i in range(2, 10)}
assert_not_mutating_input(funcs, arguments, equality_func=operator.eq)
Both will raise an AssertionError if the functions give different results or mutate the input arguments.
Typically the equality_func will be one of these:
- operator.eq() will work for most Python objects.
- math.isclose() will work for floats that may be close but not equal.
- numpy.array_equal will work for element-wise comparison of NumPy arrays.
- numpy.allclose will work for element-wise comparison of NumPy arrays containing floats that may be close but not equal.
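The choice of equality function matters most for floating-point results. The following minimal sketch uses only the standard library (it does not call simple_benchmark) to show why operator.eq can reject two results that math.isclose accepts. The input list is an arbitrary illustration.

```python
import math
import operator

values = [0.1] * 10

# The built-in sum accumulates rounding errors; math.fsum compensates for them.
naive_result = sum(values)
precise_result = math.fsum(values)

# An exact comparison treats the two results as different ...
exact_match = operator.eq(naive_result, precise_result)
# ... while a tolerance-based comparison treats them as the same.
close_match = math.isclose(naive_result, precise_result)

print(exact_match, close_match)
```

When benchmarking sum against math.fsum, a tolerance-based equality function such as math.isclose is therefore likely the more suitable choice.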
simple_benchmark.assert_not_mutating_input() also accepts an optional argument, which must be used when the argument is not trivially copyable. It expects a function that takes the argument as input and returns a deep copy of it.
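As a rough sketch of what such a copy function could look like, consider an argument that holds a lock and therefore cannot be copied with copy.deepcopy. The class LockedSamples and the function copy_locked_samples are hypothetical names invented for this illustration; they are not part of simple_benchmark.

```python
import copy
import threading


class LockedSamples:
    """Hypothetical argument type: a list of values guarded by a lock."""

    def __init__(self, values):
        self.values = list(values)
        self.lock = threading.Lock()  # lock objects cannot be deep-copied


def copy_locked_samples(argument):
    """Return an independent copy with a copied list and a fresh lock."""
    return LockedSamples(copy.deepcopy(argument.values))


original = LockedSamples([3.0, 1.0, 2.0])

# The generic deep copy fails because of the lock attribute.
try:
    copy.deepcopy(original)
    deepcopy_works = True
except TypeError:
    deepcopy_works = False

# The custom copy function succeeds, and mutating the copy
# leaves the original untouched.
duplicate = copy_locked_samples(original)
duplicate.values.sort()
```

A function like copy_locked_samples would then be passed as the optional copy argument described above. This section does not state the parameter name, so check the API reference for it.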
Times for each benchmark
The benchmark runs each function on each of the arguments a certain number of times. Generally the results will be more accurate if one increases the number of times the function is executed during each benchmark, but the benchmark will also take longer.
To control how long one benchmark should take, one can use the time_per_benchmark argument. It controls how much time each function will take for each argument. The default is 0.1s (100 milliseconds), but the value is ignored for calls that are either very fast (the benchmark then finishes sooner) or very slow (because the benchmark tries to make at least a few calls).
Another option is to limit the time a single function call may take with maximum_time. If the first call of a function exceeds maximum_time, the function will be excluded from the benchmark from this argument on.
- To control the quality of the benchmark, time_per_benchmark can be used.
- To avoid excessive benchmarking times, one can use maximum_time.
An example showing both in action:
from simple_benchmark import benchmark
from datetime import timedelta

def O_n(n):
    for i in range(n):
        pass

def O_n_squared(n):
    for i in range(n ** 2):
        pass

def O_n_cube(n):
    for i in range(n ** 3):
        pass

b = benchmark(
    [O_n, O_n_squared, O_n_cube],
    {2**i: 2**i for i in range(2, 15)},
    time_per_benchmark=timedelta(milliseconds=500),
    maximum_time=timedelta(milliseconds=500)
)
b.plot()

# To save the plotted benchmark as PNG file.
import matplotlib.pyplot as plt
plt.savefig('time_example.png')
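To make the interplay of the two settings more concrete, here is a simplified standard-library sketch of a timing loop with a time budget and a first-call cutoff. It is not the implementation used by simple_benchmark: the helper time_calls, its minimum_calls parameter, and the use of plain seconds instead of timedelta are assumptions made for illustration, and it does not reproduce the early finish for very fast calls.

```python
import time


def time_calls(func, argument, time_per_benchmark=0.1, maximum_time=None,
               minimum_calls=3):
    """Return the fastest observed call time in seconds, or None if excluded.

    Hypothetical helper for illustration; not part of simple_benchmark.
    """
    deadline = time.perf_counter() + time_per_benchmark

    start = time.perf_counter()
    func(argument)
    first = time.perf_counter() - start

    # A first call that is too slow excludes the function.
    if maximum_time is not None and first > maximum_time:
        return None

    timings = [first]
    # Keep calling until the budget is used up, but always make a few calls,
    # even if a single call already exceeds the budget.
    while time.perf_counter() < deadline or len(timings) < minimum_calls:
        start = time.perf_counter()
        func(argument)
        timings.append(time.perf_counter() - start)
    return min(timings)


def slow(_):
    time.sleep(0.05)


fast_time = time_calls(sum, list(range(100)), time_per_benchmark=0.02)
slow_time = time_calls(slow, None, time_per_benchmark=0.02)
excluded = time_calls(slow, None, time_per_benchmark=0.02, maximum_time=0.01)
```

In this sketch the slow function is still timed a few times when no cutoff is given, and it is dropped as soon as its first call exceeds maximum_time.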
Examples on StackOverflow
In some cases it’s probably best to see how it can be used on some real-life problems:
- Count the number of non zero values in a numpy array in Numba
- When numba is effective?
- Range with repeated consecutive numbers
- Concatenate tuples using sum()
- How to retrieve an element from a set without removing it?
- What exactly is the optimization “functools.partial” is making?
- Nested lambda statements when sorting lists
- How to make a flat list out of list of lists?
- How do you remove duplicates from a list whilst preserving order?
- Iterating over every two elements in a list
- Cython - efficiently filtering a typed memoryview
- Python’s sum vs. NumPy’s numpy.sum
- Finding longest run in a list
- Remove duplicate dict in list in Python
- How do I find the duplicates in a list and create another list with them?
- Suppress key addition in collections.defaultdict
- Numpy first occurrence of value greater than existing value
- Count the number of times an item occurs in a sequence using recursion Python
- Converting a series of ints to strings - Why is apply much faster than astype?