This topic provides important considerations when cloning objects in Snowflake, particularly databases, schemas, and non-temporary tables. Factors such as DDL and DML transactions (on the source object), Time Travel, and data retention periods can affect the object clone.
In this Topic:
Access Control Privileges for Cloned Objects¶
A cloned object does not retain any granted privileges on the source object itself (i.e. clones do not automatically have the same privileges as their sources). A system administrator or the owner of the cloned object must explicitly grant any required privileges to the newly-created clone.
However, if the source object is a database or schema, for child objects contained in the source, the clone replicates all granted privileges on the corresponding child objects:
For databases, contained objects include schemas, tables, views, etc.
For schemas, contained objects include tables, views, etc.
Rules for Cloning Stages¶
The following rules apply to cloning stages or objects that contain stages (i.e. databases and schemas):
Individual external named stages can be cloned; internal named stages cannot be cloned.
When cloning a database or schema:
External named stages that were present in the source when the cloning operation started are cloned.
Tables are cloned, which means their internal stages are also cloned.
Internal named stages are not cloned.
Regardless of how a stage was cloned, the clone does not include any of the files from the source. i.e. all cloned stages are empty.
Cloning and Snowpipe¶
When a database or schema is cloned, any contained pipes that reference an internal (i.e. Snowflake) stage are not cloned.
However, any pipes that reference an external stage are cloned. This results in the following behavior:
If the target table is fully qualified in the COPY statement in the pipe definition (in the form of
schema_name.table_name), this can result in duplicate data getting loaded into the target table in the source database or schema by each pipe.
If the target table is not fully qualified in the pipe definition, then the data is loaded into the target table (e.g.
mytable) in the source and cloned databases/schemas.
Cloning and Default Sequences¶
When a table with a column with a default sequence is cloned, the cloned table still references the original sequence object. If you do not want the new table to continue using the old sequence, run the following command:
ALTER TABLE <table_name> ALTER COLUMN <column_name> SET DEFAULT <new_sequence>.nextval;
Cloning and Streams¶
Currently, when a database or schema that contains source tables and streams is cloned, any unconsumed records in the streams (in the clone) are inaccessible. This behavior is consistent with Time Travel for tables. If a table is cloned, historical data for the table clone begins at the time/point when the clone was created.
Cloning and Tasks¶
When a database or schema that contains tasks is cloned, the tasks in the clone are suspended by default. The tasks can be resumed individually (using ALTER TASK … RESUME).
Impact of DDL on Cloning¶
Cloning is fast, but not instantaneous, particularly for large objects (e.g. tables). As such, if DDL statements are executed on source objects (e.g. renaming tables in a schema) while the cloning operation is in progress, the changes may not be represented in the clone. This is because DDL statements are atomic and not part of multi-statement transactions.
Furthermore, Snowflake does not record which object names were present when the cloning operation started and which names changed. As such, DDL statements that rename (or drop and recreate) source child objects compete with any in-progress cloning operations and can cause name conflicts.
In the following example, the
t_sales table is dropped and another table is altered and given the same name as the dropped table while
the parent database is being cloned, producing an error:
CREATE OR REPLACE DATABASE staging_sales CLONE sales; DROP TABLE sales.public.t_sales; ALTER TABLE sales.public.t_sales_20170522 RENAME TO sales.public.t_sales; 002002 (42710): None: SQL compilation error: Object 'T_SALES' already exists.
To avoid conflicts in name resolution during a cloning operation, we suggest refraining from renaming objects to a name previously used by a dropped object until cloning is completed.
Impact of DML and Data Retention on Cloning¶
The DATA_RETENTION_TIME_IN_DAYS parameter specifies the number of days for which Snowflake retains historical data for performing
Time Travel actions on an object. Because the data retained for Time Travel incurs storage costs at the table-level, some users set this parameter
0 for some tables, effectively disabling data retention for these tables (i.e. when the value is set to
0, Time Travel data
retained for DML transactions is purged, incurring negligible additional storage costs).
Cloning operations require time to complete, particularly for large tables. During this period, DML transactions can alter the data in a source
table. Subsequently, Snowflake attempts to clone the table data as it existed when the operation began. However, if data is purged for DML
transactions that occur during cloning (because the retention time for the table is
0), the data is unavailable to complete the operation,
producing an error similar to the following:
ProgrammingError occured: "000707 (02000): None: Data is not available." with query id None
As a workaround, we recommend either of the following best practices when cloning an object:
Refrain, if possible, from executing DML transactions on the source object (or any of its children) until after the cloning operation completes.
If this isn’t possible, prior to starting cloning, set
DATA_RETENTION_TIME_IN_DAYS=1for all tables in the schema (or database if you are cloning an entire database). Once the operation completes, remember to reset the parameter value back to
0for those tables in the source, if desired.
You might also want to set the value to
0for the cloned tables (if you plan to make DML changes to the cloned tables and do not wish to incur additional storage costs for Time Travel on the tables).
Cloning Using Time Travel (Databases, Schemas, and Tables Only)¶
This section describes considerations when using Time Travel to clone objects at a specific time/point in the past.
Cloning of Historical Objects¶
If the source object did not exist at the time/point specified in the AT | BEFORE clause, an error is returned.
In the following example, a CREATE TABLE … CLONE statement attempts to clone the source table at a point in the past (30 minutes prior) when it didn’t exist:
CREATE TABLE t_sales (numeric integer) data_retention_time_in_days=1; CREATE OR REPLACE TABLE sales.public.t_sales_20170522 CLONE sales.public.t_sales at(offset => -60*30); 002003 (02000): SQL compilation error: Object 'SALES.PUBLIC.T_SALES' does not exist.
Also note that any child object that did not exist at the specified time/point is not cloned.
Cloning of Historical Object Metadata¶
A cloned object inherits the definition of the source object at the time the statement is executed. The definition includes object names, comments, table columns, and so on.
In the following example, table
t_sales is renamed
t_sales_20170522, and then data is inserted into the renamed table. The schema
containing the table is cloned at a time 30 minutes prior, before the table was renamed and the data inserted. The clone contains a table with its
current name, not the previous name, but with a snapshot of the data at the specified time:
CREATE OR REPLACE TABLE t_sales (numeric integer) data_retention_time_in_days=1; INSERT INTO t_sales (numeric) VALUES (1),(2),(3); ALTER TABLE sales.public.t_sales RENAME TO sales.public.t_sales_20170522; INSERT INTO t_sales_20170522 (numeric) VALUES (4); CREATE OR REPLACE SCHEMA sales.private CLONE sales.public at(offset => -60*30); USE SCHEMA sales.private; SHOW TABLES; +-------------------------------+------------------+---------------+-------------+-------+---------+------------+------+-------+----------+----------------+ | created_on | name | database_name | schema_name | kind | comment | cluster_by | rows | bytes | owner | retention_time | |-------------------------------+------------------+---------------+-------------+-------+---------+------------+------+-------+----------+----------------| | 2017-05-24 10:15:07.110 -0700 | T_SALES_20170522 | SALES | PRIVATE | TABLE | | | 4 | 1024 | SYSADMIN | 1 | +-------------------------------+------------------+---------------+-------------+-------+---------+------------+------+-------+----------+----------------+ SELECT * FROM T_SALES_20170522; +---------+ | NUMERIC | |---------| | 1 | | 2 | | 3 | +---------+