
Python: How To Parallelize A Loop With Dictionary

EDITED: I have a code which looks like:

__author__ = 'feynman'

cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
def MC_Surface(volume,

Solution 1:

Since I don't quite understand your code and you said "any hints will be great", I'll give you some general suggestions. Basically, you want to speed up a for loop:

for i, this_key in enumerate(keys):

What you could do is split the keys array into several parts, something like this:

length = len(keys)
part1 = keys[:length // 3]
part2 = keys[length // 3 : 2 * length // 3]
part3 = keys[2 * length // 3:]
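Hand-slicing works for three parts, but if you want the worker count to be configurable, NumPy's `array_split` handles uneven lengths for you (this assumes `keys` is a NumPy array, as it appears to be in the question):

```python
import numpy as np

keys = np.arange(10)             # stand-in for the question's keys array
parts = np.array_split(keys, 3)  # chunk sizes as even as possible
# here the chunks have lengths 4, 3 and 3
```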

Then deal with each part in a subprocess:

from concurrent.futures import ProcessPoolExecutor

def do_work(keys, values):
    # keys and values must be split the same way so the index i stays aligned
    for i, this_key in enumerate(keys):
        mc_vol[tmp_vol == this_key] = values[i]

with ProcessPoolExecutor(max_workers=3) as e:
    e.submit(do_work, part1, values[:length // 3])
    e.submit(do_work, part2, values[length // 3 : 2 * length // 3])
    e.submit(do_work, part3, values[2 * length // 3:])

return mc_vol

And that's it.

Solution 2:

First, a dictionary lookup takes roughly constant time, whereas an array equality check such as tmp_vol == this_key is O(N). Thus you should loop over your array, not over your dictionary.

Second, you can save a lot of time with a nested list comprehension, which moves the loop overhead from Python down into C:

mc_vol = [[Perm_area[key] for key in row] for row in tmp_vol]

This gives you a list of lists, so you can avoid NumPy entirely here. If you do need a NumPy array, just convert:

mc_vol = np.array(mc_vol)
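Putting Solution 2 together on toy data (`Perm_area` and `tmp_vol` below are small stand-ins for the question's dictionary and array):

```python
import numpy as np

tmp_vol = np.array([[1, 2], [3, 1]])     # stand-in labeled volume
Perm_area = {1: 10.0, 2: 20.0, 3: 30.0}  # stand-in key -> value mapping

# One constant-time dict lookup per cell, instead of one O(N) mask per key
mc_vol = [[Perm_area[key] for key in row] for row in tmp_vol]
mc_vol = np.array(mc_vol)
# mc_vol.tolist() -> [[10.0, 20.0], [30.0, 10.0]]
```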
