
Dask-distributed. How To Get Task Key Id In The Function Being Calculated?

My computations with dask.distributed create intermediate files whose names include a UUID4 that identifies the chunk of work. pairs = '{}\n{}\n{}\n{}'.format(list1

Solution 1:

There are two ways to approach the problem:

  1. You determine the uuid and pass it to Dask (implemented)
  2. Dask determines the uuid and passes it to your function (not implemented, but possible)

You pass the uuid to Dask

Functions like .submit accept a key= keyword argument that specifies the key to use for the task:

>>> e.submit(inc, 1, key='inc-12345')
<Future: status: pending, key: inc-12345>

Similarly, dask.delayed functions support a dask_key_name keyword argument:

>>> value = delayed(inc)(1, dask_key_name='inc-12345')
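
As a minimal sketch of this first approach, the key can be built from a UUID4 before submitting, so that the same string names both the task and its intermediate file. The helper names make_task_key and intermediate_filename, and the ".tmp" suffix, are hypothetical and are not part of Dask; only the standard library is used here.

```python
import uuid


def make_task_key(prefix):
    """Return a unique key such as 'inc-<32 hex characters>' (hypothetical helper)."""
    return "{}-{}".format(prefix, uuid.uuid4().hex)


def intermediate_filename(key, suffix=".tmp"):
    """Derive the intermediate file name from the task key (hypothetical helper)."""
    return key + suffix


key = make_task_key("inc")
filename = intermediate_filename(key)

# With a distributed client `e`, as in the examples above, the same string
# could then be passed both as the task key and as an argument to the function:
#   future = e.submit(your_function, data, filename, key=key)
```

Because the key is chosen up front, the function never needs to ask Dask for it; it simply receives the file name as an ordinary argument.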

You get the key from Dask

The scheduler places contextual information like this into a per-thread global during the execution of each task. As of version 1.13, the key is available as follows:

def your_function(...):
    from distributed.worker import thread_state
    key = thread_state.key

future = e.submit(your_function, ...)
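
To illustrate the per-thread global mechanism described above, here is a standard-library sketch that does not use Dask at all. The run_task wrapper is a hypothetical stand-in for what a worker does around each task: it stores the key in a threading.local before calling the function and clears it afterwards. Dask's actual implementation may differ in its details.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the per-thread global; each thread sees its own attributes.
thread_state = threading.local()


def run_task(key, func, *args):
    """Hypothetical wrapper: publish the key for the duration of the task."""
    thread_state.key = key
    try:
        return func(*args)
    finally:
        del thread_state.key


def your_function(x):
    # Read the key of the task that is currently running in this thread.
    key = thread_state.key
    return "{}.tmp".format(key), x + 1


with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [
        pool.submit(run_task, "inc-{}".format(i), your_function, i)
        for i in range(3)
    ]
    results = [f.result() for f in futures]
```

Since the state is thread-local, tasks running concurrently in different threads each see only their own key, which is why the function can read it without receiving it as an argument.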
