Categories:
Miscellaneous Functions (System)

SYSTEM$CLUSTERING_INFORMATION

Returns clustering information for a table based on one or more columns in the table.

See also:
SYSTEM$CLUSTERING_RATIO , SYSTEM$CLUSTERING_DEPTH

Syntax

SYSTEM$CLUSTERING_INFORMATION( '<table_name>' , '( <col1> [, <col2> ... ] )' )

Arguments

table_name
Table for which you want to return clustering information.
col1 [ , col2 … ]

Column(s) in the table for which clustering information is returned:

  • For tables with no clustering keys, this argument is required. If this argument is omitted, an error is returned.
  • For tables with clustering keys, this argument is optional; if the argument is omitted, Snowflake uses the defined clustering keys to return clustering information.

Note

You can use this argument to return clustering information for any columns in the table, regardless of the clustering keys defined for the table.

Usage Notes

  • All arguments are strings, i.e. they must be enclosed in single quotes.

Output

The function returns a JSON object containing the following name/value pairs:

cluster_by_keys
Columns in table used to return clustering information; can be any columns in the table.
total_partition_count
Total number of micro-partitions that comprise the table.
total_constant_partition_count
Total number of micro-partitions for which the value of the specified columns have reached a constant state (i.e. the micro-partitions will not benefit significantly from reclustering). The number of constant micro-partitions in a table has an impact on pruning for queries. The higher the number, the more micro-partitions can be pruned from queries executed on the table, which has a corresponding impact on performance.
average_overlaps
Average number of overlapping micro-partitions for each micro-partition in the table. A high number indicates the table is not well-clustered.
average_depth

Average overlap depth of each micro-partition in the table. A high number indicates the table is not well-clustered.

This value is also returned by SYSTEM$CLUSTERING_DEPTH.

partition_depth_histogram

A histogram depicting the distribution of overlap depth for each micro-partition in the table. The histogram contains buckets with widths:

  • 0 to 16 with increments of 1.
  • For buckets larger than 16, increments of twice the width of the previous bucket (e.g. 32, 64, 128, ….).

For more information about micro-partition overlap and depth and their impact on query pruning, see Understanding Snowflake Table Structures.

Examples

Return the clustering information for a table using two columns in the table:

SELECT SYSTEM$CLUSTERING_INFORMATION('test2', '(col1, col3)');

+--------------------------------------------------------------+
| SYSTEM$CLUSTERING_INFORMATION('TEST2', '(COL1, COL3)')       |
|--------------------------------------------------------------|
| {                                                            |
|   "cluster_by_keys" : "(COL1, COL3)",                        |
|   "total_partition_count" : 1156,                            |
|   "total_constant_partition_count" : 0,                      |
|   "average_overlaps" : 117.5484,                             |
|   "average_depth" : 64.0701,                                 |
|   "partition_depth_histogram" : {                            |
|     "00000" : 0,                                             |
|     "00001" : 0,                                             |
|     "00002" : 3,                                             |
|     "00003" : 3,                                             |
|     "00004" : 4,                                             |
|     "00005" : 6,                                             |
|     "00006" : 3,                                             |
|     "00007" : 5,                                             |
|     "00008" : 10,                                            |
|     "00009" : 5,                                             |
|     "00010" : 7,                                             |
|     "00011" : 6,                                             |
|     "00012" : 8,                                             |
|     "00013" : 8,                                             |
|     "00014" : 9,                                             |
|     "00015" : 8,                                             |
|     "00016" : 6,                                             |
|     "00032" : 98,                                            |
|     "00064" : 269,                                           |
|     "00128" : 698                                            |
|   }                                                          |
| }                                                            |
+--------------------------------------------------------------+

This example indicates that the test2 table is not well-clustered for the following reasons:

  • Zero constant micro-partitions out of 1156 total micro-partitions.
  • High average of overlapping micro-partitions.
  • High average of overlap depth across micro-partitions.
  • Most of the micro-partitions are grouped at the lower-end of the histogram, with the majority of micro-partitions having an overlap depth between 64 and 128.