Database Replication Considerations

This topic describes the behavior of certain Snowflake features in secondary databases and provides general guidance for working with replicated objects and data.

In this Topic:

This section describes the behavior of certain Snowflake features in secondary databases and provides general guidance for working with replicated objects and data.

Replication and Automatic Clustering

In the primary database, Snowflake monitors clustered tables using Automatic Clustering and reclusters them as needed. As part of a refresh operation, clustered tables are replicated to a secondary database with the current sorting of the table micro-partitions. As such, reclustering is not performed again on the clustered tables in the secondary database, which would be redundant.

If a secondary database contains clustered tables and the database is promoted to become the primary database, Snowflake begins Automatic Clustering of the tables in this database while simultaneously suspending the monitoring of clustered tables in the previous primary database.

See Replication and Materialized Views (in this topic) for information about Automatic Clustering for materialized views.

Replication and Materialized Views

In the primary database, Snowflake performs automatic background maintenance of materialized views. When a base table changes, all materialized views defined on the table are updated by a background service that uses compute resources provided by Snowflake. In addition, if Automatic Clustering is enabled for a materialized view, then the view is monitored and reclustered as necessary in the primary database.

A refresh operation replicates the materialized view definitions to a secondary database; however, the materialized view data is not replicated, meaning some or all of the data in the materialized views could become out of date.

To perform automatic background maintenance of materialized views in a secondary database, explicitly set AUTO_REFRESH_MATERIALIZED_VIEWS_ON_SECONDARY = TRUE on the secondary database either when you create the secondary database (using CREATE DATABASE … AS REPLICA OF) or later (using ALTER DATABASE). If Automatic Clustering is enabled for a materialized view in the primary database, then setting AUTO_REFRESH_MATERIALIZED_VIEWS_ON_SECONDARY = TRUE on a secondary database also enables automatical monitoring and reclustering of the materialized view in the secondary database.

Note

Materialized view maintenance charges are billed to the account where the AUTO_REFRESH_MATERIALIZED_VIEWS_ON_SECONDARY parameter is enabled for a secondary database.

Time Travel

Querying tables and views in a secondary database using Time Travel can produce different results than when executing the same query in the primary database.

Historical Data

Historical data available to query in a primary database using Time Travel is not replicated to secondary databases.

For example, suppose data is loaded continuously into a table every 10 minutes using Snowpipe, and a secondary database is refreshed every hour. The refresh operation only replicates the latest version of the table. While every hourly version of the table within the retention window is available for query using Time Travel, none of the iterative versions within each hour (the individual Snowpipe loads) are available.

Data Retention Period

The data retention period for tables in a secondary database begins when the secondary database is refreshed with the DML operations (i.e. changing or deleting data) written to tables in the primary database.

Replication and Cloning

Cloned objects are replicated physically rather than logically to secondary databases. That is, cloned tables in a standard database do not contribute to the overall data storage unless or until DML operations on the clone add to or modify existing data. However, when a cloned table is replicated to a secondary database, the physical data is also replicated, increasing the data storage usage for your account.

References to Objects in Another Database

Carefully analyze whether views or table constraints in a primary database reference another database.

Views

Views that reference any object in another database (e.g. table columns, other views, UDFs, or stages) can be replicated because this type of reference is name based. Name-based references do not cause replication to fail; however, queries on the view in secondary databases will fail if the other database(s) are not replicated in the same region.

For example, suppose view V1 in database D1 references tables T1 and T2 in databases D1 and D2, respectively. To successfully query view V1 in the secondary database D1, secondary database D2 must also exist in the account (e.g. as another secondary database). In addition, for consistent query results with the primary databases, secondary databases D1 and D2 must be refreshed at the same time.

Constraints

Currently, if foreign keys reference a primary/unique key in another database, the replication will fail. That is because constraint references are ID-based. A database snapshot cannot resolve ID-based references to objects outside the database.

Historical Usage Data

Historical usage data for activity in the primary database is not replicated to secondary databases. Each account has its own query history, login history, etc.

Historical usage data includes the query data returned by the following Information Schema table functions or Account Usage views:

  • COPY_HISTORY

  • LOGIN_HISTORY

  • QUERY_HISTORY

  • etc.