Creating Parallel Tasks in Your TD Workflow
In TD Workflow, you can have tasks run in parallel. By default, tasks are run sequentially. To have TD Workflow tasks run in parallel, you must specify the parallel parameter as _parallel: True. It is also recommended that you use the +group syntax to group the tasks that you want to have run in parallel.
You can define as many tasks as you want to run in parallel, however TD can only run up to 10 separate processing threads at a given time. Tasks can have one of four different states, only one of which is Running. As long as only ten tasks have concurrent states of running each of those tasks is executed in parallel.
Parallel task order example
The following example groups two tasks and has them running in parallel.
+group:_parallel: True +step1: echo>: "hello!" +py_custom_code: docker: image: "digdag/digdagpython:3.6.8stretch" py>: tasks.printMessage +step2: td_run>: my_other_saved_query
Default task order example
For example by default, in the following workflow, the task step1 is executed first and because there is a custom script included in the step, the Python program “tasks.printMessage” is run in a separate Docker container, and when it completes, the subsequent task in step 2 is executed.
+step1: echo>: "hello!" +py_custom_code: docker: image: "digdag/digdagpython:3.6.8stretch" py>: tasks.printMessage +step2: td_run>: my_other_saved_query
TD Workflow runs until all tasks complete. Long running tasks that go beyond the set TTL (Time To Live) are timed out. Even if the task is executing a custom script. Periodically the task is polled to check on the status of the script running in the container and log messages can be seen on the TD Console and CLI using the “td wf log <attemptid>” command.