Overview of Data Unloading

Similar to data loading, Snowflake supports bulk export (i.e. unload) of data from a database table into flat, delimited text files.

In this Topic:

Data Unloading Process

Unloading data overview

As illustrated in the diagram above, the process for unloading data into files is the same as the loading process, except in reverse:

Step 1:

Use the COPY INTO location command to copy the data from the Snowflake database table into one or more files in a Snowflake or external staging location:

  • For Snowflake staging location, specify the location (user or table) where the files are written.
  • For an external staging location, specify the URL and security credentials for the S3 bucket where the files are written or use a stage object that references the external location.

This step requires an active virtual warehouse. The warehouse provides the compute resources to write rows from the table.

Step 2:

Download the file from the location:

  • From a Snowflake location, use the GET command to download the data file(s).
  • From S3, use the interfaces/tools provided by Amazon S3 to get the data file(s).

Data Unloading Using Queries

Snowflake supports specifying a SELECT statement instead of a table in the COPY INTO location command. The results of the query are written to one or more files as specified in the command and the file(s) are stored in the specified location (internal or external).

SELECT statements in COPY INTO location support the full syntax and semantics of Snowflake SQL queries, including JOIN clauses, which enables downloading data from multiple tables.

Data Unloading into Single or Multiple Files

The COPY INTO location command provides a copy option (SINGLE) for unloading data into a single file or multiple files. The default is SINGLE = FALSE (i.e. unload into multiple files).

Snowflake assigns each file a unique name. The location path specified for the command can contain a filename prefix that is assigned to all the data files generated. If a prefix is not specified, Snowflake prefixes the generated filenames with data_.

Snowflake appends a suffix that ensures each file name is unique across parallel execution threads; e.g. data_stats_0_1_0.

When unloading data into multiple files, use the MAX_FILE_SIZE copy option to specify the maximum size of each file created.