Python Joblib Performance
I need to run an embarrassingly parallel for loop. After a quick search, I found the joblib package for Python. I ran the simple test posted on the package's website. Here is the test:
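The test itself did not survive in this post, but it is likely the canonical example from the joblib documentation, which is roughly:

```python
from math import sqrt
from joblib import Parallel, delayed

# Run sqrt on 10 inputs across 2 worker processes
result = Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))
print(result)  # → [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
```

Because `sqrt` is nearly instantaneous, timing this line mostly measures joblib's worker start-up cost, which is exactly what the answer below explains.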
Solution 1:
Joblib creates new processes to run the functions you want to execute in parallel. However, creating processes takes time (around 500 ms), especially now that joblib uses spawn to create new processes (rather than fork).
Because the function you want to run in parallel finishes very quickly, the %timeit result here mostly shows the overhead of process creation. If you instead choose a function whose runtime is not negligible compared to the time required to start new processes, you will see a real improvement in performance:
Here is a sample you can run to test this:
import time
from joblib import Parallel, delayed

def f(x):
    # sleep long enough that the work dominates the process start-up cost
    time.sleep(1)
    return x

def bench_joblib(n_jobs):
    start_time = time.time()
    Parallel(n_jobs=n_jobs)(delayed(f)(x) for x in range(4))
    print('running 4 times f using n_jobs = {} : {:.2f}s'.format(
        n_jobs, time.time() - start_time))

if __name__ == "__main__":
    bench_joblib(1)
    bench_joblib(4)
Using Python 3.7 and joblib 0.12.5, I got:

running 4 times f using n_jobs = 1 : 4.01s
running 4 times f using n_jobs = 4 : 1.34s
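To isolate the overhead itself, you can run the same benchmark with a function that returns immediately, so that any elapsed time is joblib's own cost rather than useful work. A minimal sketch (the function name `trivial` is illustrative):

```python
import time
from joblib import Parallel, delayed

def trivial(x):
    # returns immediately, so elapsed time is almost entirely joblib overhead
    return x

start = time.time()
Parallel(n_jobs=4)(delayed(trivial)(x) for x in range(4))
elapsed = time.time() - start
print('4 trivial calls with n_jobs=4 took {:.2f}s (mostly worker start-up)'.format(elapsed))
```

Comparing this figure with the sleep-based benchmark above makes it clear that the fixed start-up cost is amortized only when each task does substantial work.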