Executing SQL Statements on a Schedule Using Tasks¶
User-defined tasks allow scheduled execution of SQL statements. Tasks run according to a specified execution configuration, using any combination of a set interval and/or a flexible schedule using a subset of familiar cron utility syntax. Tasks currently run in user-specified warehouses but will eventually run in a serverless environment using Snowflake-supplied compute resources.
In this Topic:
Overview of Tasks¶
Currently, a task can execute a single SQL DML statement or a call to a stored procedure when it runs. DDL statements are not supported outside of a stored procedure.
Tasks can be combined with table streams for continuous ELT workflows to process recently changed table rows. Streams ensure exactly once semantics for new or changed data in a table.
Tasks can also be used independently to, e.g.:
Generate periodic reports by inserting or merging rows into a report table.
Snowflake ensures only one instance of a task with a schedule is executed at a given time. If a task is still running when the next scheduled execution time occurs, then that scheduled time is skipped.
Task Scheduling and Daylight Saving Time¶
The cron expression in a task definition supports specifying a time zone. A scheduled task runs according to the specified cron expression in the local time for a given time zone. Special care should be taken with regard to scheduling tasks for time zones that recognize daylight saving time. Tasks scheduled during specific times on days when the transition from standard time to daylight saving time (or the reverse) occurs can have unexpected behaviors.
During the autumn change from daylight saving time to standard time, a task scheduled to start at 1 AM in the America/Los_Angeles time zone (i.e.
0 1 * * * America/Los_Angeles) would run twice: once at 1 AM and then again when 1:59:59 AM shifts to 1:00:00 AM local time. That is, there are two points in time when the local time is 1 AM.
During the spring change from standard time to daylight saving time, a task scheduled to start at 2 AM in the America/Los_Angeles time zone (i.e.
0 2 * * * America/Los_Angeles) would not run at all because the local time shifts from 1:59:59 AM to 3:00:00 AM. That is, there is no point during that day when the local time is 2 AM.
To avoid unexpected task executions due to daylight saving time, either:
Do not schedule tasks to run at a specific time between 1 AM and 3 AM (daily, or on days of the week that include Sundays), or
Manually adjust the cron expression for tasks scheduled during those hours twice each year to compensate for the time change due to daylight saving time.
Simple Tree of Tasks¶
Users can define a simple tree-like structure of tasks that starts with a root task and is linked together by task dependencies. The current implementation supports a single path between any two nodes; i.e. an individual task can have only a single predecessor (i.e. parent) task. This differs from a Directed Acyclic Graph (DAG) structure, in which a single node can have multiple parents.
A predecessor task can be defined when creating a task (using CREATE TASK … AFTER). The root task in the tree must have a defined schedule, while each of the other tasks in the tree have a defined predecessor to link them together. A task is limited to a single predecessor task; however, a task can have a maximum of 100 child tasks (i.e. other tasks that identify the task as a predecessor).
Currently, we cannot guarantee that only one instance of a task with a defined predecessor task is running at a given time.
A brief lag occurs after a parent task finishes running and any child task is executed.
For more information, including the current limitations of this feature, see CREATE TASK.
Setting Session Parameters for Tasks¶
You can set session parameters for the session in which a task runs. To do so, modify an existing task and set the desired parameter values (using ALTER TASK …
SET session_parameter = value[, session_parameter = value ... ]).
A task supports all session parameters. For the complete list, see Parameters.
Note that a task does not support account or user parameters.
Viewing the Task History for Your Account¶
Account administrators (i.e. users with the ACCOUNTADMIN role) or task owners (i.e. users with roles that have the OWNERSHIP privilege on a task) can use SQL to view the task history within a specified date range.
To view the task history:
To support creating and managing tasks, Snowflake provides the following set of special DDL commands:
In addition, providers can view, grant, or revoke access to the necessary database objects for ELT using the following standard access control DDL:
To support retrieving information about tasks, Snowflake provides the following set of SQL functions:
Required Access Privileges¶
Creating, managing, and executing tasks requires a role with a minimum of the following role permissions:
Required to run any tasks the role owns. Revoking the EXECUTE TASK privilege on a role prevents all subsequent task runs from starting under that role.
USAGE, CREATE TASK
In addition, the role must have the permissions required to run the SQL statement executed by the task.
Creating a Task Administrator Role¶
A task is suspended by default when it is created. The task owner (i.e. the role with the OWNERSHIP privilege on the task) must execute ALTER TASK … RESUME to allow the task to run based on the parameters specified in the task definition.
Executing the ALTER TASK command also requires the task owner to have the global EXECUTE TASK privilege. Note that only account administrators (users with the ACCOUNTADMIN role) can grant the EXECUTE TASK privilege to a role.
For ease of use, we recommend creating a custom role (e.g. TASKADMIN) and assigning the EXECUTE TASK privilege to this role. Any role that can grant privileges (e.g. SECURITYADMIN or any role with the MANAGE GRANTS privilege) can then grant this custom role to any task owner role to allow altering their own tasks. To remove the ability for the task owner role to execute the task, it is only necessary to revoke this custom role from the task owner role. Note that if you choose not to create this custom role, an account administrator must revoke the EXECUTE TASK privilege from the task owner role.
For example, create a custom role name TASKADMIN and grant that role the EXECUTE TASK privilege. Assign the TASKADMIN role to a task owner role named MYROLE:
USE ROLE securityadmin; CREATE ROLE taskadmin; -- set the active role to ACCOUNTADMIN before granting the EXECUTE TASK privilege to the new role USE ROLE accountadmin; GRANT EXECUTE TASK ON ACCOUNT TO ROLE taskadmin; -- set the active role to SECURITYADMIN to show that this role can grant a role to another role USE ROLE securityadmin; GRANT ROLE taskadmin TO ROLE myrole;
For more information on creating custom roles and role hierarchies, see Configuring Access Control.
Dropping a Task Owner Role¶
When the owner role of a given task (i.e. the role with the OWNERSHIP privilege on the task) is deleted, the task is “re-possessed” by the owner of the role that is deleted. This ensures that ownership moves to a role that is closer to the root of the role hierarchy. When a task is re-possessed, it is automatically paused, i.e., all executions currently in flight complete processing, but new executions will not be scheduled until the task is resumed explicitly by the new owner. The rationale for this is to prevent a user with access to a particular role from leaving behind tasks that suddenly execute with higher permissions when the role is removed.
If the role that a running task is executing under is dropped while the task is running, the task completes processing under the dropped role.
This section provides a high-level overview of the task setup workflow.
Complete the steps in Creating a Task Administrator Role (in this topic) to create a role that can be used to execute the commands in the following steps.
Create a task using CREATE TASK. The task is suspended by default.
Verify the SQL statement or stored procedure that you will reference in a task executes as expected before you create the task. Tasks are intended to automate SQL statements and stored procedures that have already been tested thoroughly.
Execute ALTER TASK … RESUME to allow the task to run based on the parameters specified in the task definition.
This section describes a methodical approach to troubleshooting tasks that do not run as expected.
Task Did Not Run¶
Step 1: Verify That the Task Did Not Run¶
Query the TASK_HISTORY table function to verify the task did not run. It is possible that the task ran successfully but the SQL statement in the task definition failed. In particular, note the scheduled and completed times, as well as any error code and message.
If the task has a parent task (in a tree of tasks), verify whether the parent task completed successfully.
Step 2: Verify the Task was Resumed¶
To resume an individual task, execute ALTER TASK … RESUME. To recursively enable all dependent tasks tied to a root task, query the SYSTEM$TASK_DEPENDENTS_ENABLE function rather than enabling each task individually.
While you are reviewing the task details, if the task has a schedule, also check the cron expression. Verify that at least one occurrence of the scheduled time has passed.
Step 3: Verify the Permissions Granted to the Task Owner¶
Verify the task owner (i.e. the role that has the OWNERSHIP privilege on the task) has the global EXECUTE TASK privilege (using SHOW GRANTS TO ROLE
role_name). This privilege is required to execute tasks.
If the role does not have the EXECUTE TASK privilege, assign the privilege as an account administrator (user with the ACCOUNTADMIN role), e.g.:
USE ROLE accountadmin; GRANT EXECUTE TASK ON ACCOUNT TO ROLE <role_name>;
Task Timed out or Exceeded the Schedule Window¶
Currently, there is a one-hour limit on a single run of a task. This limitation was implemented as a safeguard against non-terminating tasks. Query the TASK_HISTORY table function. If the task was canceled or exceeded the window scheduled for the task, the cause is often an undersized warehouse. Review the warehouse size and consider increasing it to fit within the schedule window or the one-hour limit.
A larger warehouse may not help if there are query parallelization issues. Consider looking at alternate ways to rewrite the SQL statement run by the task.