This tutorial shows you how to create a schedule for your workflow, and then submit the schedule to Treasure Data.
The workflow functionality is also partially described in the onboarding tutorial. If you haven’t yet installed and started using TD Workflows, you might want to complete the onboarding tutorial before this one.
Add schedule information to your workflow
To add a schedule to your workflow, add the following at the top of your workflow file.
timezone: UTC

schedule:
  daily>: 07:00:00
The timezone setting is optional, and its default value is UTC. Specify the time zone using a tz database name. Some examples of valid time zones are America/Los_Angeles, Europe/Berlin, and Asia/Tokyo.
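As a minimal sketch of where these settings sit in a complete workflow file, the following assumes a hypothetical task named +report that only prints a message. The task name and the Asia/Tokyo time zone are illustrative choices, not part of this tutorial. With this timezone setting, 07:00:00 would be interpreted as 7:00 a.m. Tokyo time rather than UTC.

```yaml
# Hypothetical workflow file, for illustration only.
timezone: Asia/Tokyo      # tz database name; defaults to UTC when omitted

schedule:
  daily>: 07:00:00        # run once a day at 07:00:00 in the timezone above

# Placeholder task (assumed name); replace it with your own tasks.
+report:
  echo>: "Running the scheduled session for ${session_date}"
```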
You can choose one of the following options:

| Syntax | Description | Example |
|---|---|---|
| minutes_interval>: M | Run this job every M minutes | minutes_interval>: 30 |
| hourly>: MM:SS | Run this job every hour at MM:SS (Hourly, +MM mins SS secs) | hourly>: 25:00 (Hourly, +25 minutes) |
| daily>: HH:MM:SS | Run this job every day at HH:MM:SS (Daily, @HH:MM:SS AM/PM) | daily>: 13:30:00 |
| weekly>: DDD,HH:MM:SS | Run this job every week on DDD at HH:MM:SS (Every DDD, @HH:MM:SS AM/PM) | weekly>: Sun,09:00:00 |
| monthly>: D,HH:MM:SS | Run this job every month on day D at HH:MM:SS (Every D of month, @HH:MM:SS AM/PM) | monthly>: 1,09:15:00 |
| cron>: CRON | Use cron format for complex scheduling | cron>: 42 4 1 * * |

The minutes_interval example runs the job every 30 minutes. For example, if the job started at 6:10 a.m., then the job runs again at 6:40, 7:10, 7:40, and so on.

The hourly example runs the job every hour, 25 minutes into the hour. For example, 8:25, 9:25, 10:25, and so on.

The daily example runs the job every day at 1:30 p.m.

Tip: Times use a 24-hour clock. To run your job at midnight each day, specify 00:00:00. To specify 30 minutes past midnight, enter 00:30:00. To specify 30 minutes past noon, enter 12:30:00.

The weekly example runs the job every week on Sunday at 9:00 a.m.

The monthly example runs the job on the first day of each month at 9:15 a.m. To specify 9:15 p.m. instead, type monthly>: 1,21:15:00.

The cron example runs the job at minute 42 of hour 4 (4:42 a.m.) on day 1 of each month.
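To make the cron fields concrete, here is a sketch that annotates the cron expression shown above. It assumes the standard five-field cron order (minute, hour, day of month, month, day of week). The commented-out weekday variant is a hypothetical alternative, not taken from this tutorial.

```yaml
schedule:
  # Field order: minute hour day-of-month month day-of-week
  #   42 -> minute 42
  #   4  -> hour 4 (4 a.m.)
  #   1  -> day 1 of the month
  #   *  -> every month
  #   *  -> any day of the week
  cron>: 42 4 1 * *

  # Hypothetical alternative (keep only one schedule entry per workflow):
  # cron>: 0 9 * * 1-5    # 09:00 on Monday through Friday
```

Times in the expression are interpreted in the workflow's timezone, which is UTC unless you set timezone explicitly.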
Tip: You are not required to specify hours, minutes, or seconds (HH, MM, or SS). You might even save some processing time if you omit them. For example, if you specify daily, then the job runs once per day: the job runs and then, 24 hours later, runs again. If you specify weekly, then the job runs once per week: the job runs and then, 7 days later, runs again at the same time of day that the job ran initially.
Submit the workflow to Treasure Data to run on a scheduled basis
Now that you’ve created a workflow, you want it to run as you scheduled. Run this command to submit the workflow to Treasure Data:
$ td wf push <project_name>
That’s it! Now your workflow will run at the scheduling interval you set.
List the workflows registered on Treasure Data
$ td wf workflows
Find out what workflows are scheduled to run next on Treasure Data
$ td wf schedules
Incremental Processing Workflows
Refer to the following topics if you want to create workflows that process data incrementally:
- Tutorial: Bringing data into your system incrementally
- Referencing a data connector to transfer data incrementally