
Python: Multiprocess Workers, Tracking Tasks Completed (missing Completions)

The default multiprocessing.Pool code includes a counter to keep track of the number of tasks a worker has completed: completed += 1, followed by logging.debug('worker exiting after %d tasks' % completed).

Solution 1:

It works that way because you are not explicitly setting "chunksize" in pool.map:

map(func, iterable[, chunksize])

This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.

Source: https://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.pool

For 8 items, with len(pool) = 2, chunksize will be 1 (divmod(8, 2*4) gives (1, 0)), so you see (8 / 1) / 2 = 4 workers:

workers = (number of items / chunksize) / tasks per process

For 20 items, with len(pool) = 2, chunksize will be 3 (divmod(20, 2*4) gives (2, 4), and the nonzero remainder bumps it to 3), so you see something like (20 / 3) / 2 ≈ 3.3 workers.

For 40 items, chunksize = 5 (divmod(40, 2*4) gives (5, 0)), so workers = (40 / 5) / 2 = 4 workers.
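The heuristic behind these numbers can be sketched as a small function. This mirrors the divmod-based calculation in CPython's multiprocessing/pool.py (linked below); the function name and signature here are my own:

```python
def calc_chunksize(n_items, n_workers):
    # Mirrors CPython's multiprocessing/pool.py heuristic:
    # aim for roughly 4 chunks per worker process.
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize

# Worked examples from above, with 2 worker processes:
print(calc_chunksize(8, 2))   # 1
print(calc_chunksize(20, 2))  # 3
print(calc_chunksize(40, 2))  # 5
```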

If you want, you can set chunksize=1

res = pool.map(ret_x, range(20), 1)

And you will see (20/1)/2 = 10 workers

python mppp.py
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
made a worker!
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

So chunksize is the number of items from the iterable that get bundled into a single task handed to a worker process.
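A runnable sketch of the whole setup, reconstructed from the output above (the original mppp.py is not shown, so the pool settings are assumptions: 2 processes, maxtasksperchild=2, and an initializer that prints "made a worker!"):

```python
import multiprocessing


def made_worker():
    # Initializer: runs once in each new worker process,
    # so counting these lines counts worker creations.
    print('made a worker!')


def ret_x(x):
    return x


def run(n_items, chunksize):
    # maxtasksperchild=2 makes each worker exit after 2 tasks,
    # forcing the pool to spawn replacements.
    pool = multiprocessing.Pool(processes=2, initializer=made_worker,
                                maxtasksperchild=2)
    try:
        return pool.map(ret_x, range(n_items), chunksize)
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    print(run(20, 1))
```

With chunksize=1 the 20 items become 20 tasks, and at 2 tasks per worker that means roughly 10 workers get created, matching the output above.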

How to calc chunksize: https://hg.python.org/cpython/file/1c54def5947c/Lib/multiprocessing/pool.py#l305
