Overview of Warehouses¶
Warehouses are required for queries, as well as all DML operations, including loading data into tables. A warehouse is defined by its size, as well as the other properties that can be set to help control and automate warehouse activity.
Warehouses can be started and stopped at any time. They can also be resized at any time, even while running, to accommodate the need for more or less compute resources, based on the type of operations being performed by the warehouse.
In this Topic:
- Warehouse Size
- Auto-suspension and Auto-resumption
- Query Processing and Concurrency
- Warehouse Usage in Sessions
Warehouse Size¶
Size specifies the number of servers that make up each cluster in a warehouse. Snowflake supports the following warehouse sizes:
| Warehouse Size | Servers / Cluster | Credits / Hour | Credits / Second | Notes |
| X-Small | 1 | 1 | 0.0003 | Default size for warehouses created using CREATE WAREHOUSE. |
| X-Large | 16 | 16 | 0.0044 | Default for warehouses created in the web interface. |
Impact on Credit Usage and Billing¶
As shown in the table above, the number of servers in a warehouse cluster corresponds one-to-one to the number of credits the cluster consumes (and is therefore billed) for each full hour that the warehouse runs. Note, however, that Snowflake uses per-second billing (with a 60-second minimum each time the warehouse starts), so warehouses are billed only for the credits they actually consume.
The total number of credits billed depends on how long the warehouse runs continuously. For comparison purposes, the following table shows the billing totals for warehouses of three different sizes based on their running time (totals rounded to the nearest thousandth of a credit):
| Running Time | Credits (X-Small) | Credits (X-Large) | Credits (3X-Large) |
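The per-second billing rule described above can be sketched as a small calculation. This is only an illustrative model, not Snowflake's billing implementation: the function name `credits_billed` and the sample running times are made up for this example, and the model assumes one credit per server per hour with a 60-second minimum for each continuous run.

```python
def credits_billed(servers_per_cluster: int, running_seconds: float) -> float:
    """Estimate credits for one continuous run of a single-cluster warehouse.

    Assumes 1 credit per server per hour, billed per second, with a
    60-second minimum charged each time the warehouse starts.
    """
    minimum_billed_seconds = 60
    billed_seconds = max(running_seconds, minimum_billed_seconds)
    return servers_per_cluster * billed_seconds / 3600


# Server counts for X-Small and X-Large come from the size table;
# 64 for 3X-Large comes from the multi-cluster example in this topic.
SERVERS = {"X-Small": 1, "X-Large": 16, "3X-Large": 64}

# Hypothetical running times, in seconds, chosen for illustration only.
for seconds in (30, 61, 600, 3600):
    row = {size: round(credits_billed(n, seconds), 3) for size, n in SERVERS.items()}
    print(seconds, row)
```

Under this model, any run shorter than one minute costs the same as a one-minute run, and a full hour costs exactly the hourly rate for the size.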
For a multi-cluster warehouse, the number of credits billed is calculated based on the number of servers per cluster and the number of clusters that run within the time period.
For example, if a 3X-Large multi-cluster warehouse runs 1 cluster for one full hour and then runs 2 clusters for the next full hour, the total number of credits billed would be 192 (i.e. 64 + 128).
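The arithmetic in this example can be reproduced with a short sketch. The helper `multi_cluster_credits` is a hypothetical name used only for illustration, and it assumes each period is billed as servers per cluster times clusters running times hours.

```python
def multi_cluster_credits(servers_per_cluster, periods):
    """Sum credits over periods, each given as (clusters_running, hours)."""
    return sum(servers_per_cluster * clusters * hours for clusters, hours in periods)


# 3X-Large (64 servers per cluster): 1 cluster for one hour, then 2 clusters for one hour.
total = multi_cluster_credits(64, [(1, 1), (2, 1)])
print(total)  # 192
```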
Multi-cluster warehouses are an Enterprise Edition feature.
Impact on Data Loading¶
Increasing the size of a warehouse does not always improve data loading performance. Data loading performance is influenced more by the number of files being loaded (and the size of each file) than the size of the warehouse.
Unless you are bulk loading a large number of files concurrently (i.e. hundreds or thousands of files), a smaller warehouse (Small, Medium, Large) is generally sufficient. Using a larger warehouse (X-Large, 2X-Large, etc.) will consume more credits and may not result in any performance increase.
For more data loading tips and guidelines, see Data Loading Considerations.
Impact on Query Processing¶
The size of a warehouse can impact the amount of time required to execute queries submitted to the warehouse, particularly for larger, more complex queries. In general, query performance scales linearly with warehouse size because additional compute resources are provisioned with each size increase.
If queries processed by a warehouse are running slowly, you can always resize the warehouse to provision more servers. The additional servers do not impact any queries that are already running, but they are available for use by any queries that are queued or newly submitted.
Larger is not necessarily faster for small, basic queries.
For more warehouse tips and guidelines, see Warehouse Considerations.
Auto-suspension and Auto-resumption¶
A warehouse can be set to automatically resume or suspend, based on activity:
- If auto-suspend is enabled, the warehouse is automatically suspended if the warehouse is inactive for the specified period of time.
- If auto-resume is enabled, the warehouse is automatically resumed when any statement that requires a warehouse is submitted and the warehouse is the current warehouse for the session.
These properties can be used to simplify and automate your monitoring and usage of warehouses to match your workload. Auto-suspend ensures that you do not leave a warehouse running (and consuming credits) when there are no incoming queries. Similarly, auto-resume ensures that the warehouse starts up again as soon as it is needed.
Auto-suspend and auto-resume apply only to the entire warehouse and not to the individual clusters in the warehouse. For a multi-cluster warehouse:
- Auto-suspend only occurs when the minimum number of clusters is running and there is no activity for the specified period of time. The minimum is typically 1 (cluster), but could be more than 1.
- Auto-resume only applies when the entire warehouse is suspended (i.e. no clusters are running).
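The behavior of the two properties can be illustrated with a toy simulation of a single warehouse. This is a sketch under simplifying assumptions, not how Snowflake implements the feature: the class `SimulatedWarehouse` and its methods are hypothetical, statements are treated as instantaneous, and the multi-cluster rules above are not modeled.

```python
class SimulatedWarehouse:
    """Toy model of auto-suspend and auto-resume for a single warehouse."""

    def __init__(self, auto_suspend_seconds, auto_resume=True):
        self.auto_suspend_seconds = auto_suspend_seconds  # None disables auto-suspend
        self.auto_resume = auto_resume
        self.running = False
        self.last_activity = None

    def tick(self, now):
        """Suspend the warehouse if it has been idle for the configured period."""
        if (
            self.running
            and self.auto_suspend_seconds is not None
            and now - self.last_activity >= self.auto_suspend_seconds
        ):
            self.running = False  # auto-suspend

    def submit(self, now):
        """Submit a statement at time `now` (seconds); return True if it can run."""
        self.tick(now)
        if not self.running:
            if not self.auto_resume:
                return False  # the warehouse would have to be resumed manually
            self.running = True  # auto-resume
        self.last_activity = now
        return True


wh = SimulatedWarehouse(auto_suspend_seconds=300)
wh.submit(0)      # first statement resumes the warehouse
wh.tick(299)      # still inside the idle window, keeps running
wh.tick(300)      # idle period reached, warehouse suspends
wh.submit(1000)   # next statement resumes it again
```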
Query Processing and Concurrency¶
The number of queries that a warehouse can concurrently process is determined by the size and complexity of each query. As queries are submitted, the warehouse calculates and reserves the compute resources needed to process each query. If the warehouse does not have enough remaining resources to process a query, the query is queued, pending resources that become available as other running queries complete.
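The reserve-or-queue behavior described above can be pictured with a toy model. Everything here is an assumption made for illustration: the class `SimulatedQueue`, the abstract "resource units", and the strict first-in, first-out ordering are not taken from Snowflake, whose actual scheduling is not described in this topic.

```python
from collections import deque


class SimulatedQueue:
    """Toy model: queries reserve resource units and queue when too few are free."""

    def __init__(self, capacity):
        self.free = capacity
        self.running = {}      # query id -> reserved units
        self.queued = deque()  # (query id, units) in arrival order

    def submit(self, query_id, units):
        # Run immediately only if nothing is waiting ahead and the resources fit.
        if not self.queued and units <= self.free:
            self.free -= units
            self.running[query_id] = units
        else:
            self.queued.append((query_id, units))

    def complete(self, query_id):
        """Release a finished query's resources, then start queued queries that fit."""
        self.free += self.running.pop(query_id)
        while self.queued and self.queued[0][1] <= self.free:
            next_id, units = self.queued.popleft()
            self.free -= units
            self.running[next_id] = units


q = SimulatedQueue(capacity=8)
q.submit("a", 6)   # runs: 6 of 8 units reserved
q.submit("b", 4)   # queued: only 2 units are free
q.submit("c", 2)   # queued behind "b" in this strict-FIFO sketch
q.complete("a")    # frees 6 units, so "b" and then "c" start
```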
Snowflake provides object-level parameters that can be set to help control query processing and concurrency.
If queries are queuing more than desired, another warehouse can be created and queries can be manually redirected to the new warehouse. In addition, resizing a warehouse can enable limited scaling for query concurrency and queuing; however, warehouse resizing is primarily intended for improving query performance.
To enable fully automated scaling for concurrency, Snowflake recommends multi-cluster warehouses, which provide essentially the same benefits as creating additional warehouses and redirecting queries, but without requiring manual intervention.
Multi-cluster warehouses are an Enterprise Edition feature.
Warehouse Usage in Sessions¶
When a session is initiated in Snowflake, the session does not, by default, have a warehouse associated with it. Until a session has a warehouse associated with it, queries cannot be submitted within the session.
Default Warehouse for Users¶
To facilitate querying immediately after a session is initiated, Snowflake supports specifying a default warehouse for each individual user. The default warehouse for a user is used as the warehouse for all sessions initiated by the user.
Default Warehouse for Client Utilities/Drivers/Connectors¶
In addition to default warehouses for users, any of the Snowflake clients (SnowSQL, JDBC driver, ODBC driver, Python connector, etc.) can have a default warehouse:
- SnowSQL supports both a configuration file and command line option for specifying a default warehouse.
- The drivers and connectors support specifying a default warehouse as a connection parameter when initiating a session.
For more information, see Connecting to Snowflake.
Precedence for Warehouse Defaults¶
When a user connects to Snowflake and starts a session, Snowflake determines the default warehouse for the session in the following order:
Default warehouse for the user,
» overridden by…
Default warehouse in the configuration file for the client utility (SnowSQL, JDBC driver, etc.) used to connect to Snowflake (if the client supports configuration files),
» overridden by…
Default warehouse specified on the client command line or through the driver/connector parameters passed to Snowflake.
In addition, the default warehouse for a session can be changed at any time by executing the USE WAREHOUSE command within the session.
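The precedence order above amounts to "the last source that specifies a warehouse wins". A minimal sketch, assuming each source either names a warehouse or is left unset (`None`); the function `resolve_default_warehouse`, its parameter names, and the warehouse names are hypothetical and do not correspond to any Snowflake client API:

```python
def resolve_default_warehouse(user_default=None, client_config=None, connection_param=None):
    """Return the session's default warehouse; later sources override earlier ones."""
    warehouse = None
    # Ordered from lowest to highest precedence.
    for candidate in (user_default, client_config, connection_param):
        if candidate is not None:
            warehouse = candidate
    return warehouse


# The user default applies when the client specifies nothing.
print(resolve_default_warehouse(user_default="USER_WH"))
# A command-line option or connection parameter overrides both other sources.
print(resolve_default_warehouse("USER_WH", "CONFIG_WH", "PARAM_WH"))
```

The sketch covers only the defaults chosen when the session starts; a later USE WAREHOUSE command replaces whatever value was resolved here.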