Dask-distributed. How To Get Task Key Id In The Function Being Calculated?
My computations with dask.distributed create intermediate files whose names include a UUID4 that identifies that chunk of work.
pairs = '{}\n{}\n{}\n{}'.format(list1
Solution 1:
There are two ways to approach the problem:
- You determine the uuid and pass it to Dask (implemented)
- Dask determines the uuid and passes it to your function (not implemented, but possible)
You pass the uuid to Dask
Functions like .submit accept a key= keyword argument where you can specify the key that you want used:
>>> e.submit(inc, 1, key='inc-12345')
<Future: status: pending, key: inc-12345>
Similarly, dask.delayed functions support a dask_key_name keyword argument:
>>> value = delayed(inc)(1, dask_key_name='inc-12345')
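To tie this back to the question, here is a minimal sketch of how the same UUID-based identifier could be used both as the Dask key and as an argument to the function, so the task can name its intermediate file. The names make_key, process_chunk, submit_chunk and delayed_chunk, and the file naming scheme, are hypothetical and not part of Dask; only the key= and dask_key_name= arguments come from the answer above.

```python
import uuid


def make_key(prefix):
    """Build an identifier such as 'process_chunk-<uuid4 hex>' (hypothetical scheme)."""
    return "{}-{}".format(prefix, uuid.uuid4().hex)


def process_chunk(chunk, chunk_id):
    """Hypothetical task: derives its intermediate file name from chunk_id."""
    filename = "intermediate-{}.txt".format(chunk_id)
    return filename, sum(chunk)


def submit_chunk(client, chunk):
    """Submit one chunk, passing the identifier to the task and using it as the key."""
    key = make_key("process_chunk")
    # key= is consumed by submit; the positional `key` is passed on to process_chunk
    return client.submit(process_chunk, chunk, key, key=key)


def delayed_chunk(chunk):
    """Same idea with dask.delayed, using dask_key_name instead of key."""
    from dask import delayed  # imported lazily so the helpers above work without dask

    key = make_key("process_chunk")
    return delayed(process_chunk)(chunk, key, dask_key_name=key)


if __name__ == "__main__":
    from distributed import Client

    client = Client(processes=False)  # small in-process cluster for a local check
    future = submit_chunk(client, [1, 2, 3])
    print(future.key, future.result())
    client.close()
```

Because the key is generated on the client, the function does not need any access to Dask internals to learn its own identifier.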
You get the key from Dask
The scheduler places contextual information like this into a per-thread global during the execution of each task. As of version 1.13 of distributed, this is available as follows:
def your_function(...):
    from distributed.worker import thread_state
    key = thread_state.key

future = e.submit(your_function, ...)
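A slightly more defensive sketch of the same idea follows. It assumes the thread_state import path shown above is still valid in the installed version of distributed; newer releases also appear to offer get_worker().get_current_task(), which is worth checking against the documentation for your version. The helpers current_task_key and intermediate_filename, and the file naming scheme, are hypothetical.

```python
def current_task_key(default=None):
    """Return the key of the task running in this thread, or `default` outside a task."""
    try:
        # Import path taken from the answer above; it may differ between versions.
        from distributed.worker import thread_state
    except ImportError:
        return default
    # thread_state only carries a `key` attribute while a task is executing
    return getattr(thread_state, "key", default)


def intermediate_filename(key, suffix=".txt"):
    """Turn a task key into a file name (hypothetical naming scheme)."""
    safe = str(key).replace("/", "_")
    return "intermediate-{}{}".format(safe, suffix)


def your_function(x):
    """Example task: looks up its own key and derives a file name from it."""
    key = current_task_key(default="no-task")
    return intermediate_filename(key), x + 1


if __name__ == "__main__":
    from distributed import Client

    client = Client(processes=False)  # small in-process cluster for a local check
    future = client.submit(your_function, 1)
    print(future.key, future.result())
    client.close()
```

Falling back to a default lets the same function be called directly, outside of Dask, for example in unit tests.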