Summing With A For Loop Faster Than With Reduce?
Solution 1:
The reduce(add, r)
must invoke the add()
function 100 times, so the overhead of the function calls adds up -- reduce uses PyEval_CallObject
to invoke add
on each iteration:
for (;;) {
...
if (result == NULL)
result = op2;
else {
# here it is creating a tuple to pass the previous result and the next# value from range(100) into func add():
PyTuple_SetItem(args, 0, result);
PyTuple_SetItem(args, 1, op2);
if ((result = PyEval_CallObject(func, args)) == NULL)
goto Fail;
}
Updated: Response to question in comments.
When you type 1 + 2
in Python source code, the bytecode compiler performs the addition in place and replaces that expression with 3
:
f1 = lambda: 1 + 2
c1 = byteplay.Code.from_code(f1.func_code)
print c1.code
1 1 LOAD_CONST 3
2 RETURN_VALUE
If you add two variables a + b
the compiler will generate bytecode which loads the two variables and performs a BINARY_ADD, which is far faster than calling a function to perform the addition:
f2 = lambda a, b: a + b
c2 = byteplay.Code.from_code(f2.func_code)
print c2.code
1 1 LOAD_FAST a
2 LOAD_FAST b
3 BINARY_ADD
4 RETURN_VALUE
Solution 2:
edit: Switching out zeroes instead of array multiply closes the gap big time.
from functools import reduce
from numpy import array, arange, zeros
fromtime import time
def add(x, y):
return x + y
def sum_columns(x):
if x.any():
width = len(x[0])
total = zeros(width)
for row in x:
total += array(row)
return total
l = arange(3000000)
l = array([l, l, l])
start = time()
print(reduce(add, l))
print('Reduce took {}'.format(time() - start))
start = time()
print(sum_columns(l))
print('For loop took took {}'.format(time() - start))
Gets you down almost no difference at all.
Reduce took 0.03230619430541992
For loop took took 0.058577775955200195
old: If reduce is used for adding together NumPy arrays by index, it can be faster than a for loop.
from functools import reduce
from numpy import array, arange
fromtime import time
def add(x, y):
return x + y
def sum_columns(x):
if x.any():
width = len(x[0])
total = array([0] * width)
for row in x:
total += array(row)
return total
l = arange(3000000)
l = array([l, l, l])
start = time()
print(reduce(add, l))
print('Reduce took {}'.format(time() - start))
start = time()
print(sum_columns(l))
print('For loop took took {}'.format(time() - start))
With the result of
[ 0 3 6 ..., 8999991 8999994 8999997]
Reduce took 0.024930953979492188[ 0 3 6 ..., 8999991 8999994 8999997]
For loop took took 0.3731539249420166
Post a Comment for "Summing With A For Loop Faster Than With Reduce?"