Categories:

Data Loading / Unloading DDL

CREATE STAGE

Creates a new named internal or external stage to use for loading data from files into Snowflake tables and unloading data from tables into files:

Internal stage

Stores data files internally within Snowflake. Internal stages can be either permanent or temporary. For more details, see Choosing a Stage for Local Files.

External stage

References data files stored in a location outside of Snowflake. Currently, AWS S3 buckets or Microsoft Azure containers are supported. The storage location (i.e. AWS S3 or Microsoft Azure) can be either private/protected or public. For more details, see Creating an S3 Stage or Creating an Azure Stage.

Note

By default, CREATE STAGE creates permanent internal stages.

See also:

ALTER STAGE , SHOW STAGES

PUT , COPY INTO <table>

COPY INTO <location> , GET

In this Topic:

Syntax

-- Internal stage
CREATE [ OR REPLACE ] [ TEMPORARY ] STAGE [ IF NOT EXISTS ] <internal_stage_name>
  [ FILE_FORMAT = ( { FORMAT_NAME = '<file_format_name>' | TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] ) } ]
  [ COPY_OPTIONS = ( copyOptions ) ]
  [ COMMENT = '<string_literal>' ]

-- External stage
CREATE [ OR REPLACE ] [ TEMPORARY ] STAGE [ IF NOT EXISTS ] <external_stage_name>
    externalStageParams
  [ FILE_FORMAT = ( { FORMAT_NAME = '<file_format_name>' | TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] ) } ]
  [ COPY_OPTIONS = ( copyOptions ) ]
  [ COMMENT = '<string_literal>' ]

Where:

externalStageParams (for AWS S3 bucket)

externalStageParams ::=
  URL = 's3://<bucket>[/<path>/]'

  [ { STORAGE_INTEGRATION = <integration_name> } | { CREDENTIALS = ( {  { AWS_KEY_ID = `<string>` AWS_SECRET_KEY = `<string>` [ AWS_TOKEN = `<string>` ] } | AWS_ROLE = `<string>`  } ) ) }` ]
  [ ENCRYPTION = ( [ TYPE = AWS_CSE ] [ MASTER_KEY = '<string>' ] |
                   [ TYPE = AWS_SSE_S3 ] |
                   [ TYPE = AWS_SSE_KMS [ KMS_KEY_ID = '<string>' ] |
                   [ TYPE = NONE ] ) ]

externalStageParams (for Google Cloud Storage bucket)

externalStageParams ::=
  URL = 'gcs://<bucket>[/<path>/]'
  [ STORAGE_INTEGRATION = <integration_name> ]

externalStageParams (for Microsoft Azure container)

externalStageParams ::=
  URL = 'azure://<account>.blob.core.windows.net/<container>[/<path>/]'
  [ CREDENTIALS = ( [ AZURE_SAS_TOKEN = <string> ] ) ]
  [ ENCRYPTION = ( [TYPE = { AZURE_CSE | NONE } ] [ MASTER_KEY = '<string>' ] ) ]

formatTypeOptions (type-specific)

formatTypeOptions ::=
-- If TYPE = CSV
     COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
     RECORD_DELIMITER = '<character>' | NONE
     FIELD_DELIMITER = '<character>' | NONE
     FILE_EXTENSION = '<string>'
     SKIP_HEADER = <integer>
     DATE_FORMAT = '<string>' | AUTO
     TIME_FORMAT = '<string>' | AUTO
     TIMESTAMP_FORMAT = '<string>' | AUTO
     BINARY_FORMAT = HEX | BASE64 | UTF8
     ESCAPE = '<character>' | NONE
     ESCAPE_UNENCLOSED_FIELD = '<character>' | NONE
     TRIM_SPACE = TRUE | FALSE
     FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | NONE
     NULL_IF = ( '<string>' [ , '<string>' ... ] )
     ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE | FALSE
     VALIDATE_UTF8 = TRUE | FALSE
     EMPTY_FIELD_AS_NULL = TRUE | FALSE
     SKIP_BYTE_ORDER_MARK = TRUE | FALSE
     ENCODING = '<string>' | UTF8
-- If TYPE = JSON
     COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
     FILE_EXTENSION = '<string>'
     ENABLE_OCTAL = TRUE | FALSE
     ALLOW_DUPLICATE = TRUE | FALSE
     STRIP_OUTER_ARRAY = TRUE | FALSE
     STRIP_NULL_VALUES = TRUE | FALSE
     IGNORE_UTF8_ERRORS = TRUE | FALSE
     SKIP_BYTE_ORDER_MARK = TRUE | FALSE
-- If TYPE = AVRO
     COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
-- If TYPE = ORC
     -- none (no options are supported)
-- If TYPE = PARQUET
     COMPRESSION = AUTO | LZO | SNAPPY | NONE
     BINARY_AS_TEXT = TRUE | FALSE
     SNAPPY_COMPRESSION = TRUE | FALSE
-- If TYPE = XML
     COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
     IGNORE_UTF8_ERRORS = TRUE | FALSE
     PRESERVE_SPACE = TRUE | FALSE
     STRIP_OUTER_ELEMENT = TRUE | FALSE
     DISABLE_SNOWFLAKE_DATA = TRUE | FALSE
     DISABLE_AUTO_CONVERT = TRUE | FALSE
     SKIP_BYTE_ORDER_MARK = TRUE | FALSE

copyOptions

copyOptions ::=
     ON_ERROR = { CONTINUE | SKIP_FILE | SKIP_FILE_<num> | SKIP_FILE_<num>% | ABORT_STATEMENT }
     SIZE_LIMIT = <num>
     PURGE = TRUE | FALSE
     RETURN_FAILED_ONLY = TRUE | FALSE
     ENFORCE_LENGTH = TRUE | FALSE
     TRUNCATECOLUMNS = TRUE | FALSE
     FORCE = TRUE | FALSE

Required Parameters

internal_stage_name or . external_stage_name

Specifies the identifier for the stage; must be unique for the schema in which the stage is created.

In addition, the identifier must start with an alphabetic character and cannot contain spaces or special characters unless the entire identifier string is enclosed in double quotes (e.g. "My object"). Identifiers enclosed in double quotes are also case-sensitive.

For more details, see Identifier Syntax.

Note

When creating an external stage, a URL is also required. For more details, see External Stage Parameters (in this topic).

If a URL is not specified, Snowflake creates an internal stage by default.

Optional Parameters

TEMPORARY

Specifies that the stage created is temporary and will be dropped at the end of the session in which it was created. Note:

  • When a temporary external stage is dropped, only the stage itself is dropped; the data files are not removed.

  • When a temporary internal stage is dropped, all of the files in the stage are purged from Snowflake, regardless of their load status. This prevents files in temporary internal stages from using data storage and, consequently, accruing storage charges. However, this also means that the staged files cannot be recovered through Snowflake once the stage is dropped.

    Tip

    If you plan to create and use temporary internal stages, you should maintain copies of your data files outside of Snowflake.

FILE_FORMAT = ( FORMAT_NAME = 'file_format_name' ) or . FILE_FORMAT = ( TYPE = CSV | JSON | AVRO | ORC | PARQUET | XML [ ... ] )

Specifies the file format for the stage, which can be either:

FORMAT_NAME = 'file_format_name'

Specifies an existing named file format to use for the stage. The named file format determines the format type (CSV, JSON, etc.), as well as any other format options, for the data files loaded using this stage. For more details, see CREATE FILE FORMAT.

TYPE = CSV | JSON | AVRO | ORC | PARQUET | XML [ ... ]

Specifies the type of files for the stage:

  • Loading data from a stage (using COPY INTO <table>) accommodates all of the supported format types.

  • Unloading data into a stage (using COPY INTO <location>) accommodates CSV, JSON, or PARQUET.

If a file format type is specified, additional format-specific options can be specified. For more details, see Format Type Options (formatTypeOptions) (in this topic).

Default: TYPE = CSV

Note

FORMAT_NAME and TYPE are mutually exclusive; you can only specify one or the other for a stage.

COPY_OPTIONS = ( ... )

Specifies one (or more) copy options for the stage. For more details, see Copy Options (copyOptions) (in this topic).

COMMENT = 'string_literal'

Specifies a comment for the stage.

Default: No value

External Stage Parameters (externalStageParams)

URL = 'cloud_specific_url'

If this parameter is omitted, Snowflake creates an internal stage

Important

Enclose the URL in single quotes ('') in order for Snowflake to identify the string. If the quotes are omitted, any credentials you supply may be displayed in plain text in the history. We strongly recommend verifying the syntax of the CREATE STAGE statement before you execute it.

The Create Stage wizard in the Snowflake web interface automatically encloses field values in quotation characters, as needed. Using the web interface to create stages is a good option for less-experienced users.

AWS S3

URL = 's3://bucket[/path/]'

Specifies the URL for the external location (existing S3 bucket) used to store data files for loading/unloading, where:

  • bucket is the name of the S3 bucket.

  • path is the optional path (also known as the prefix) that limits the set of files to load. Only files that start with the specified path are included in the data load.

Google Cloud Storage

URL = 'gcs://bucket[/path/]'

Specifies the URL for the external location (existing GCS bucket) used to store data files for loading/unloading, where:

  • bucket is the name of the GCS bucket.

  • path is the optional path (also known as the prefix) that limits the set of files to load. Only files that start with the specified path are included in the data load.

Microsoft Azure

URL = 'azure://account.blob.core.windows.net/container[/path/]'

Specifies the URL for the external location (existing Azure container) used to store data files for loading, where:

  • account is the name of the Azure account (e.g. myaccount).

  • container is the name of the Azure container (e.g. mycontainer).

  • path is the optional folder path prefix for the data files. Specifying a path restricts the set of files loaded to the files with the folder path prefix.

Default: No value (an internal stage is created)

STORAGE_INTEGRATION = integration_name or . CREDENTIALS = ( cloud_specific_credentials )

Required only if the storage location is private/protected; not required for public buckets/containers

AWS S3

STORAGE_INTEGRATION = integration_name>

Specifies the name of the storage integration used to delegate authentication responsibility for external cloud storage to a Snowflake identity and access management (IAM) entity. For more details, see CREATE STORAGE INTEGRATION.

Note

We highly recommend the use of storage integrations. This option avoids the need to supply cloud storage credentials using the CREDENTIALS parameter when creating stages or loading data.

CREDENTIALS = ( AWS_KEY_ID = 'string' AWS_SECRET_KEY = 'string' [ AWS_TOKEN = 'string' ] ) or . CREDENTIALS = ( AWS_ROLE = 'string' )

Specifies the security credentials for connecting to AWS and accessing the private/protected S3 bucket where the files to load/unload are staged. For more information, see Configuring Secure Access to AWS S3.

The credentials you specify depend on whether you associated the Snowflake access permissions for the bucket with an AWS IAM (Identity & Access Management) user or role:

  • IAM user: IAM credentials are required. Temporary (aka “scoped”) credentials are generated by AWS Security Token Service (STS) and consist of three components:

    • AWS_KEY_ID

    • AWS_SECRET_KEY

    • AWS_TOKEN

    All three are required to access a private/protected bucket. After a designated period of time, temporary credentials expire and can no longer be used. You must then generate a new set of valid temporary credentials.

    Important

    The COPY command also allows permanent (aka “long-term”) credentials to be used; however, for security reasons, Snowflake does not recommend using them. If you must use permanent credentials, Snowflake recommends periodically generating new permanent credentials for external stages.

  • IAM role: Omit the security credentials and access keys and, instead, identify the role using AWS_ROLE and specify the AWS role ARN (Amazon Resource Name).

Google Cloud Storage

STORAGE_INTEGRATION = integration_name>

Specifies the name of the storage integration used to delegate authentication responsibility for external cloud storage to a Snowflake identity and access management (IAM) entity. For more details, see CREATE STORAGE INTEGRATION.

Microsoft Azure

CREDENTIALS = ( AZURE_SAS_TOKEN = 'string' )

Specifies the SAS (shared access signature) token for connecting to Azure and accessing the private/protected container where the files containing loaded data are staged. Credentials are generated by Azure.

Default: No value (no credentials are provided for the external stage)

ENCRYPTION = ( cloud_specific_encryption )

Required only for loading from/unloading into encrypted files; not required if files are unencrypted

AWS S3

ENCRYPTION = ( [ TYPE = AWS_CSE ] [ MASTER_KEY = '<string>' ] | [ TYPE = AWS_SSE_S3 ] | [ TYPE = AWS_SSE_KMS [ KMS_KEY_ID = '<string>' ] | [ TYPE = NONE ] )

Specifies the encryption settings used to decrypt encrypted files in the S3 bucket and extract data:

TYPE = ...

Specifies the encryption type used. Possible values are:

  • AWS_CSE: Client-side encryption (requires a MASTER_KEY value). Currently, the client-side master key you provide can only be a symmetric key. Note that, when a MASTER_KEY value is provided, Snowflake assumes TYPE = AWS_CSE (i.e. when a MASTER_KEY value is provided, TYPE is not required).

  • AWS_SSE_S3: Server-side encryption that requires no additional encryption settings.

  • AWS_SSE_KMS: Server-side encryption that accepts an optional KMS_KEY_ID value.

For more information about the encryption types, see the AWS documentation for client-side encryption or server-side encryption.

MASTER_KEY = 'string' (applies only to AWS_CSE encryption)

Specifies the client-side master key that was used to encrypt the files in the bucket. The master key must be a 128-bit or 256-bit key in Base64-encoded form. Snowflake requires this key to decrypt encrypted files in the bucket and extract data for loading.

KMS_KEY_ID = 'string' (applies only to AWS_SSE_KMS encryption)

Optionally specifies the ID for the AWS KMS-managed key that is used to encrypt files unloaded into the bucket. If no value is provided, your default KMS key ID is used to encrypt files on unload.

Note that this value is ignored for data loading.

Microsoft Azure

ENCRYPTION = ( [ TYPE = AZURE_CSE | NONE ] [ MASTER_KEY = 'string' ] )

Specifies the encryption settings used to decrypt the files containing the loaded data:

TYPE = ...

Specifies the encryption type used. Possible values are:

  • AZURE_CSE: Client-side encryption (requires a MASTER_KEY value).

  • NONE: No encryption.

MASTER_KEY = 'string' (applies only to AZURE_CSE encryption)

Specifies the client-side master key that Snowflake uses to encrypt the files containing the unloaded data. The master key must be a 128-bit or 256-bit key in Base64-encoded form.

Default: No value (files in the external stage are not encrypted)

Format Type Options (formatTypeOptions)

Depending on the file format type specified (FILE_FORMAT = ( TYPE = ... )), you can include one or more of the following format-specific options (separated by blank spaces, commas, or new lines):

TYPE = CSV

COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
Use

Data loading and unloading

Definition
  • When loading data, specifies the current compression algorithm for the data file. Snowflake uses this option to detect how an already-compressed data file was compressed so that the compressed data in the file can be extracted for loading.

  • When unloading data, compresses the data file using the specified compression algorithm.

Values

Supported Values

Notes

AUTO

When loading data, compression algorithm detected automatically, except for Brotli-compressed files, which cannot currently be detected automatically. When unloading data, files are automatically compressed using the default, which is gzip.

GZIP

BZ2

BROTLI

Must be specified if loading/unloading Brotli-compressed files.

ZSTD

Zstandard v0.8 (and higher) is supported.

DEFLATE

Deflate-compressed files (with zlib header, RFC1950).

RAW_DEFLATE

Raw Deflate-compressed files (without header, RFC1951).

NONE

When loading data, indicates that the files have not been compressed. When unloading data, specifies that the unloaded files are not compressed.

Default

AUTO

RECORD_DELIMITER = 'character' | NONE
Use

Data loading and unloading

Definition

Single-byte character string that separates records in an input file. Accepts common escape sequences, octal values (prefixed by \\), or hex values (prefixed by 0x). Also accepts a value of NONE.

For high-level (extended) ASCII characters, specify the octal (prefixed by \\) or hex representation (prefixed by 0x) of the character. For example, for records delimited by the cents (¢) character, specify the octal (\\242) or hex (0xA2) value.

Default

New line character. Note that “new line” is logical such that \r\n will be understood as a new line for files on a Windows platform.

FIELD_DELIMITER = 'character' | NONE
Use

Data loading and unloading

Definition

Single-byte character string that separates fields in an input file. Accepts common escape sequences, octal values, or hex values. Also accepts a value of NONE.

For high-level (extended) ASCII characters, specify the octal (prefixed by \\) or hex representation (prefixed by 0x) of the character. For example, for fields delimited by the cents (¢) character, specify the octal (\\242) or hex (0xA2) value.

Default

comma (,)

FILE_EXTENSION = 'string' | NONE
Use

Data unloading only

Definition

Specifies the extension for files unloaded to a stage. Accepts any extension. The user is responsible for specifying a file extension that can be read by any desired software or services.

Default

null, meaning the file extension is determined by the format type: .csv[compression], where compression is the extension added by the compression method, if COMPRESSION is set.

Note

If the SINGLE copy option is TRUE, then the COPY command unloads a file without a file extension by default. To specify a file extension, provide a file name and extension in the internal_location or external_location path (e.g. copy into @stage/data.csv).

SKIP_HEADER = integer
Use

Data loading only

Definition

Number of lines at the start of the file to skip.

Note that SKIP_HEADER does not use the RECORD_DELIMITER or FIELD_DELIMITER values to determine what a header line is; rather, it simply skips the specified number of CRLF (Carriage Return, Line Feed)-delimited lines in the file. RECORD_DELIMITER and FIELD_DELIMITER are then used to determine the rows of data to load.

Default

0

DATE_FORMAT = 'string' | AUTO
Use

Data loading and unloading

Definition

Defines the format of date values in the data files (data loading) or table (data unloading). If a value is not specified or is AUTO, the value for the DATE_INPUT_FORMAT (data loading) or DATE_OUTPUT_FORMAT (data unloading) parameter is used.

Default

AUTO

TIME_FORMAT = 'string' | AUTO
Use

Data loading and unloading

Definition

Defines the format of time values in the data files (data loading) or table (data unloading). If a value is not specified or is AUTO, the value for the TIME_INPUT_FORMAT (data loading) or TIME_OUTPUT_FORMAT (data unloading) parameter is used.

Default

AUTO

TIMESTAMP_FORMAT = string' | AUTO
Use

Data loading and unloading

Definition

Defines the format of timestamp values in the data files (data loading) or table (data unloading). If a value is not specified or is AUTO, the value for the TIMESTAMP_INPUT_FORMAT (data loading) or TIMESTAMP_OUTPUT_FORMAT (data unloading) parameter is used.

Default

AUTO

BINARY_FORMAT = HEX | BASE64 | UTF8
Use

Data loading and unloading

Definition

Defines the encoding format for binary input or output. The option can be used when loading data into or unloading data from binary columns in a table.

Default

HEX

ESCAPE = 'character' | NONE
Use

Data loading and unloading

Definition

Single character string used as the escape character for any field values. Accepts common escape sequences, octal values, or hex values. Also accepts a value of NONE.

  • When loading data, specifies the escape character for enclosed fields. Specify the character used to enclose fields by setting FIELD_OPTIONALLY_ENCLOSED_BY.

  • When unloading data, if this option is set, it overrides the escape character set for ESCAPE_UNENCLOSED_FIELD.

Default

NONE

ESCAPE_UNENCLOSED_FIELD = 'character' | NONE
Use

Data loading and unloading

Definition

Single character string used as the escape character for unenclosed field values only. Accepts common escape sequences, octal values, or hex values. Also accepts a value of NONE.

When unloading data, if ESCAPE is set, the escape character set for that option overrides this option.

Default

backslash (\\)

TRIM_SPACE = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether to remove white space from fields.

For example, if your external database software encloses fields in quotes, but inserts a leading space, Snowflake reads the leading space rather than the opening quotation character as the beginning of the field (i.e. the quotation marks are interpreted as part of the string of field data). Set this option to TRUE to remove undesirable spaces during the data load.

As another example, if leading or trailing spaces surround quotes that enclose strings, you can remove the surrounding spaces using this option and the quote character using the FIELD_OPTIONALLY_ENCLOSED_BY option. Note that any spaces within the quotes are preserved. For example, assuming FIELD_DELIMITER = '|' and FIELD_OPTIONALLY_ENCLOSED_BY = '"':

|"Hello world"|    /* loads as */  >Hello world<
|" Hello world "|  /* loads as */  > Hello world <
| "Hello world" |  /* loads as */  >Hello world<

(the brackets in this example are not loaded; they are used to demarcate the beginning and end of the loaded strings)

Default

FALSE

FIELD_OPTIONALLY_ENCLOSED_BY = 'character' | NONE
Use

Data loading and unloading

Definition

Character used to enclose strings. Value can be NONE, single quote character ('), or double quote character ("). To use the single quote character, use the octal or hex representation (0x27) or the double single-quoted escape ('').

When a field contains this character, escape it using the same character. For example, if the value is the double quote character and a field contains the string A "B" C, escape the double quotes as follows:

A ""B"" C

Default

NONE

NULL_IF = ( 'string1' [ , 'string2' , ... ] )
Use

Data loading and unloading

Definition

String used to convert to and from SQL NULL:

  • When loading data, Snowflake replaces these strings in the data load source with SQL NULL. To specify more than one string, enclose the list of strings in parentheses and use commas to separate each value.

    For example:

    NULL_IF = ('\\N', 'NULL', 'NUL', '')

    Note that this option can include empty strings and only applies to columns that are nullable.

  • When unloading data, Snowflake converts SQL NULL values to the first value in the list.

Default

\\N (i.e. NULL, which assumes the ESCAPE_UNENCLOSED_FIELD value is \\

ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether to generate a parsing error if the number of delimited columns (i.e. fields) in an input file does not match the number of columns in the corresponding table.

If set to FALSE, an error is not generated and the load continues. If the file is successfully loaded:

  • If the input file contains records with more fields than columns in the table, the matching fields are loaded in order of occurrence in the file and the remaining fields are not loaded.

  • If the input file contains records with fewer fields than columns in the table, the non-matching columns in the table are loaded with NULL values.

This option assumes all the records within the input file are the same length (i.e. a file containing records of varying length return an error regardless of the value specified for this parameter).

Default

TRUE

Note

When transforming data during loading (i.e. using a query as the source for the COPY command), this option is ignored. There is no requirement for your data files to have the same number and ordering of columns as your target table.

VALIDATE_UTF8 = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether to validate UTF-8 character encoding in string column data.

If set to TRUE, Snowflake validates UTF-8 character encoding in string column data. When invalid UTF-8 character encoding is detected, the COPY command produces an error.

Default

TRUE

Important

This option is provided only to ensure backward compatibility with earlier versions of Snowflake. You should not disable this option unless instructed by Snowflake Support.

EMPTY_FIELD_AS_NULL = TRUE | FALSE
Use

Data loading and unloading

Definition

Boolean that specifies whether to insert SQL NULL for empty fields in an input file.

If set to FALSE, Snowflake attempts to cast an empty field to the corresponding column type. An empty string is inserted into columns of type STRING. For other column types, the COPY command produces an error.

Default

TRUE

SKIP_BYTE_ORDER_MARK = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether to skip the BOM (byte order mark), if present in a data file. A BOM is a character code at the beginning of a data file that defines the byte order and encoding form.

If set to FALSE, Snowflake recognizes any BOM in data files, which could result in the BOM either causing an error or being merged into the first column in the table.

Default

TRUE

ENCODING = 'string'
Use

Data loading only

Definition

String (constant) that specifies the character set of the source data when loading data into a table.

Character Set

ENCODING Value

Supported Languages

Notes

Big5

BIG5

Traditional Chinese

EUC-JP

EUCJP

Japanese

EUC-KR

EUCKR

Korean

GB18030

GB18030

Chinese

IBM420

IBM420

Arabic

IBM424

IBM424

Hebrew

ISO-2022-CN

ISO2022CN

Simplified Chinese

ISO-2022-JP

ISO2022JP

Japanese

ISO-2022-KR

ISO2022KR

Korean

ISO-8859-1

ISO88591

Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish

ISO-8859-2

ISO88592

Czech, Hungarian, Polish, Romanian

ISO-8859-5

ISO88595

Russian

ISO-8859-6

ISO88596

Arabic

ISO-8859-7

ISO88597

Greek

ISO-8859-8

ISO88598

Hebrew

ISO-8859-9

ISO88599

Turkish

KOI8-R

KOI8R

Russian

Shift_JIS

SHIFTJIS

Japanese

UTF-8

UTF8

All languages

For loading data from delimited files (CSV, TSV, etc.), UTF-8 is the default. . . For loading data from all other supported file formats (JSON, Avro, etc.), as well as unloading data, UTF-8 is the only supported character set.

UTF-16

UTF16

All languages

UTF-16BE

UTF16BE

All languages

UTF-16LE

UTF16LE

All languages

UTF-32

UTF32

All languages

UTF-32BE

UTF32BE

All languages

UTF-32LE

UTF32LE

All languages

windows-1250

WINDOWS1250

Czech, Hungarian, Polish, Romanian

windows-1251

WINDOWS1251

Russian

windows-1252

WINDOWS1252

Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish

windows-1253

WINDOWS1253

Greek

windows-1254

WINDOWS1254

Turkish

windows-1255

WINDOWS1255

Hebrew

windows-1256

WINDOWS1256

Arabic

Default

UTF8

Note

Snowflake stores all data internally in the UTF-8 character set. The data is converted into UTF-8 before it is loaded into Snowflake. If the VALIDATE_UTF8 file format option is TRUE, Snowflake validates the UTF-8 character encoding in string column data after it is converted from its original character encoding.

TYPE = JSON

COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
Use

Data loading and unloading

Definition
  • When loading data, specifies the current compression algorithm for the data file. Snowflake uses this option to detect how an already-compressed data file was compressed so that the compressed data in the file can be extracted for loading.

  • When unloading data, compresses the data file using the specified compression algorithm.

Values

Supported Values

Notes

AUTO

When loading data, compression algorithm detected automatically, except for Brotli-compressed files, which cannot currently be detected automatically. When unloading data, files are automatically compressed using the default, which is gzip.

GZIP

BZ2

BROTLI

Must be specified if loading/unloading Brotli-compressed files.

ZSTD

Zstandard v0.8 (and higher) is supported.

DEFLATE

Deflate-compressed files (with zlib header, RFC1950).

RAW_DEFLATE

Raw Deflate-compressed files (without header, RFC1951).

NONE

When loading data, indicates that the files have not been compressed. When unloading data, specifies that the unloaded files are not compressed.

Default

AUTO

FILE_EXTENSION = 'string' | NONE
Use

Data unloading only

Definition

Specifies the extension for files unloaded to a stage. Accepts any extension. The user is responsible for specifying a file extension that can be read by any desired software or services.

Default

null, meaning the file extension is determined by the format type: .json[compression], where compression is the extension added by the compression method, if COMPRESSION is set.

ENABLE_OCTAL = TRUE | FALSE
Use

Data loading only

Definition

Boolean that enables parsing of octal numbers.

Default

FALSE

ALLOW_DUPLICATE = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies to allow duplicate object field names (only the last one will be preserved).

Default

FALSE

STRIP_OUTER_ARRAY = TRUE | FALSE
Use

Data loading only

Definition

Boolean that instructs the JSON parser to remove outer brackets (i.e [ ]).

Default

FALSE

STRIP_NULL_VALUES = TRUE | FALSE
Use

Data loading only

Definition

Boolean that instructs the JSON parser to remove object fields or array elements containing null values. For example, when set to TRUE:

Before

After

[null]

[]

[null,null,3]

[,,3]

{"a":null,"b":null,"c":123}

{"c":123}

{"a":[1,null,2],"b":{"x":null,"y":88}}

{"a":[1,,2],"b":{"y":88}}

Default

FALSE

IGNORE_UTF8_ERRORS = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether UTF-8 encoding errors produce error conditions. If set to TRUE, any invalid UTF-8 sequences are silently replaced with Unicode character U+FFFD (i.e. “replacement character”).

Default

FALSE

SKIP_BYTE_ORDER_MARK = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether to skip the BOM (byte order mark), if present in a data file. A BOM is a character code at the beginning of a data file that defines the byte order and encoding form.

If set to FALSE, Snowflake recognizes any BOM in data files, which could result in the BOM either causing an error or being merged into the first column in the table.

Default

TRUE

TYPE = AVRO

COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
Use

Data loading and unloading

Definition
  • When loading data, specifies the current compression algorithm for the data file. Snowflake uses this option to detect how an already-compressed data file was compressed so that the compressed data in the file can be extracted for loading.

  • When unloading data, compresses the data file using the specified compression algorithm.

Values

Supported Values

Notes

AUTO

When loading data, compression algorithm detected automatically, except for Brotli-compressed files, which cannot currently be detected automatically. When unloading data, files are automatically compressed using the default, which is gzip.

GZIP

BZ2

BROTLI

Must be specified if loading/unloading Brotli-compressed files.

ZSTD

Zstandard v0.8 (and higher) is supported.

DEFLATE

Deflate-compressed files (with zlib header, RFC1950).

RAW_DEFLATE

Raw Deflate-compressed files (without header, RFC1951).

NONE

When loading data, indicates that the files have not been compressed. When unloading data, specifies that the unloaded files are not compressed.

Default

AUTO

TYPE = PARQUET

COMPRESSION = AUTO | LZO | SNAPPY | NONE
Use

Data loading and unloading

Definition
  • When loading data, specifies the current compression algorithm for the data file. Snowflake uses this option to detect how an already-compressed data file was compressed so that the compressed data in the file can be extracted for loading.

  • When unloading data, compresses the data file using the specified compression algorithm.

Values

Supported Values

Notes

AUTO

When loading data, compression algorithm detected automatically. When unloading data, unloaded files are compressed using the Snappy compression algorithm by default.

LZO

Supported for data unloading only. Must be specified if unloading LZO-compressed files. Support for LZO compression is in public preview.

SNAPPY

May be specified if unloading Snappy-compressed files.

NONE

When loading data, indicates that the files have not been compressed. When unloading data, specifies that the unloaded files are not compressed.

Default

AUTO

SNAPPY_COMPRESSION = TRUE | FALSE
Use

Data unloading only

Definition

Boolean that specifies whether unloaded file(s) are compressed using the SNAPPY algorithm.

Limitations

Only supported for data unloading operations.

Default

TRUE

BINARY_AS_TEXT = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether to interpret columns with no defined logical data type as UTF-8 text. When set to FALSE, Snowflake interprets these columns as binary data.

Limitations

Only supported for data loading operations.

Default

TRUE

TYPE = XML

COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
Use

Data loading and unloading

Definition
  • When loading data, specifies the current compression algorithm for the data file. Snowflake uses this option to detect how an already-compressed data file was compressed so that the compressed data in the file can be extracted for loading.

  • When unloading data, compresses the data file using the specified compression algorithm.

Values

Supported Values

Notes

AUTO

When loading data, compression algorithm detected automatically, except for Brotli-compressed files, which cannot currently be detected automatically. When unloading data, files are automatically compressed using the default, which is gzip.

GZIP

BZ2

BROTLI

Must be specified if loading/unloading Brotli-compressed files.

ZSTD

Zstandard v0.8 (and higher) is supported.

DEFLATE

Deflate-compressed files (with zlib header, RFC1950).

RAW_DEFLATE

Raw Deflate-compressed files (without header, RFC1951).

NONE

When loading data, indicates that the files have not been compressed. When unloading data, specifies that the unloaded files are not compressed.

Default

AUTO

IGNORE_UTF8_ERRORS = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether UTF-8 encoding errors produce error conditions. If set to TRUE, any invalid UTF-8 sequences are silently replaced with Unicode character U+FFFD (i.e. “replacement character”).

Default

FALSE

PRESERVE_SPACE = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether the XML parser preserves leading and trailing spaces in element content.

Default

FALSE

STRIP_OUTER_ELEMENT = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether the XML parser strips out the outer XML element, exposing 2nd level elements as separate documents.

Default

FALSE

DISABLE_SNOWFLAKE_DATA = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether the XML parser disables recognition of Snowflake semi-structured data tags.

Default

FALSE

DISABLE_AUTO_CONVERT = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether the XML parser disables automatic conversion of numeric and Boolean values from text to native representation.

Default

FALSE

SKIP_BYTE_ORDER_MARK = TRUE | FALSE
Use

Data loading only

Definition

Boolean that specifies whether to skip any BOM (byte order mark) present in an input file. A BOM is a character code at the beginning of a data file that defines the byte order and encoding form.

If set to FALSE, Snowflake recognizes any BOM in data files, which could result in the BOM either causing an error or being merged into the first column in the table.

Default

TRUE

Note

TYPE = ORC does not support any format type options.

Copy Options (copyOptions)

You can specify one or more of the following copy options (separated by blank spaces, commas, or new lines):

COPY_OPTIONS = ( ... )

ON_ERROR = CONTINUE | SKIP_FILE | SKIP_FILE_num | SKIP_FILE_num% | ABORT_STATEMENT

String (constant) that specifies the action to perform when an error is encountered while loading data from a file:

Supported Values

Notes

CONTINUE

Continue loading the file.

SKIP_FILE

Skip file if any errors encountered in the file.

SKIP_FILE_num (e.g. SKIP_FILE_10)

Skip file when the number of errors in the file is equal to or exceeds the specified number.

SKIP_FILE_num% (e.g. SKIP_FILE_10%)

Skip file when the percentage of errors in the file exceeds the specified percentage.

ABORT_STATEMENT

Abort the COPY statement if any error is encountered.

Default: ABORT_STATEMENT

SIZE_LIMIT = num

Number (> 0) that specifies the maximum size (in bytes) of data to be loaded for a given COPY statement. When the threshold is exceeded, the COPY operation discontinues loading files. This option is commonly used to load a common group of files using multiple COPY statements. For each statement, the data load continues until the specified SIZE_LIMIT is exceeded, before moving on to the next statement.

For example, suppose a set of files in a stage path were each 10 MB in size. If multiple COPY statements set SIZE_LIMIT to 25000000 (25 MB), each would load 3 files. That is, each COPY operation would discontinue after the SIZE_LIMIT threshold was exceeded.

Note that at least one file is loaded regardless of the value specified for SIZE_LIMIT:code: unless there is no file to be loaded.

Default: null (no size limit)

PURGE = TRUE | FALSE

Boolean that specifies whether to purge the data files from the location automatically after the data is successfully loaded.

Default: FALSE

RETURN_FAILED_ONLY = TRUE | FALSE

Boolean that specifies whether to return only files that have failed to load in the statement result.

Default: FALSE

ENFORCE_LENGTH = TRUE | FALSE

Alternative syntax for TRUNCATECOLUMNS with reverse logic (for compatibility with other systems)

Boolean that specifies whether to truncate text strings that exceed the target column length:

  • If TRUE, the COPY statement produces an error if a loaded string exceeds the target column length.

  • If FALSE, strings are automatically truncated to the target column length.

Default: TRUE

Note

  • If the length of the target string column is set to the maximum (e.g. VARCHAR (16777216)), an incoming string cannot exceed this length; otherwise, the COPY command produces an error.

  • This parameter is functionally equivalent to TRUNCATECOLUMNS, but has the opposite behavior. It is provided for compatibility with other databases. It is only necessary to include one of these two parameters in a COPY statement to produce the desired output.

TRUNCATECOLUMNS = TRUE | FALSE

Alternative syntax for ENFORCE_LENGTH with reverse logic (for compatibility with other systems)

Boolean that specifies whether to truncate text strings that exceed the target column length:

  • If TRUE, strings are automatically truncated to the target column length.

  • If FALSE, the COPY statement produces an error if a loaded string exceeds the target column length.

Default: FALSE

Note

  • If the length of the target string column is set to the maximum (e.g. VARCHAR (16777216)), an incoming string cannot exceed this length; otherwise, the COPY command produces an error.

  • This parameter is functionally equivalent to ENFORCE_LENGTH, but has the opposite behavior. It is provided for compatibility with other databases. It is only necessary to include one of these two parameters in a COPY statement to produce the desired output.

FORCE = TRUE | FALSE

Boolean that specifies to load all files, regardless of whether they’ve been loaded previously and have not changed since they were loaded. Note that this option reloads files, potentially duplicating data in a table.

Default: FALSE

LOAD_UNCERTAIN_FILES = TRUE | FALSE

Boolean that specifies to load files for which the load status is unknown. The COPY command skips these files by default.

The load status is unknown if all of the following conditions are true:

  • The file’s LAST_MODIFIED date (i.e. date when the file was staged) is older than 64 days.

  • The initial set of data was loaded into the table more than 64 days earlier.

  • If the file was already loaded successfully into the table, this event occurred more than 64 days earlier.

To force the COPY command to load all files regardless of whether the load status is known, use the FORCE option instead.

For more information about load status uncertainty, see Loading Older Files.

Default: FALSE

Usage Notes

  • URL, STORAGE_INTEGRATION, CREDENTIALS, and ENCRYPTION only apply to external stages. If you are creating an internal stage, you can ignore these parameters.

  • CREATE STAGE does not check whether the specified URL or credentials are valid. If the credentials are not valid, when you attempt to use the stage, the system returns an error.

  • If referencing a file format in the current namespace, you can omit the single quotes around the format identifier.

Examples

Internal Stages

Create an internal stage named my_int_stage with the default file format type (CSV):

  • All the corresponding default CSV file format options are used.

  • All the default copy options are used, except for ON_ERROR. If a COPY INTO <table> command that references this stage encounters a data error on any of the records, it skips the file.

CREATE OR REPLACE STAGE my_int_stage
  COPY_OPTIONS = (ON_ERROR='skip_file');

Create a temporary internal stage named my_temp_int_stage with all the same properties as the previous example, except the copy option to skip files on error:

CREATE OR REPLACE TEMPORARY STAGE my_temp_int_stage;

Create a temporary internal stage named my_int_stage that references a file format named my_csv_format (created using CREATE FILE FORMAT):

CREATE OR REPLACE TEMPORARY STAGE my_int_stage
  FILE_FORMAT = my_csv_format;

When you reference the stage in a COPY INTO <table> statement, the file format options are automatically set

External Stages

AWS S3

Create an external stage named my_ext_stage using a private/protected S3 bucket named load with a folder path named files. Secure access to the S3 bucket is provided via the load storage integration:

CREATE OR REPLACE STAGE my_ext_stage URL='s3://load/files/'
    STORAGE_INTEGRATION = myint;

Create an external stage named my_ext_stage1 using a private/protected S3 bucket named load with a folder path named files. The Snowflake access permissions for the S3 bucket are associated with an IAM user; therefore, IAM credentials are required:

CREATE OR REPLACE STAGE my_ext_stage URL='s3://load/files/'
    CREDENTIALS=(AWS_KEY_ID='1a2b3c' AWS_SECRET_KEY='4x5y6z');

Note that the AWS_KEY_ID and AWS_SECRET_KEY values used in this example are for illustration purposes only.

Create an external stage named my_ext_stage2 using an S3 bucket named load with a folder path named encrypted_files and client-side encryption (default encryption type) with the master key to decrypt/encrypt files stored in the bucket:

CREATE OR REPLACE STAGE my_ext_stage2 URL='s3://load/encrypted_files/'
    CREDENTIALS=(AWS_KEY_ID='1a2b3c' AWS_SECRET_KEY='4x5y6z')
    ENCRYPTION=(MASTER_KEY = 'eSxX0jzYfIamtnBKOEOwq80Au6NbSgPH5r4BDDwOaO8=');

Create an external stage named my_ext_stage3 using an S3 bucket named load with a folder path named encrypted_files and AWS_SSE_KMS server-side encryption with the ID for the master key to decrypt/encrypt files stored in the bucket:

CREATE OR REPLACE STAGE my_ext_stage3 URL='s3://load/encrypted_files/'
  CREDENTIALS=(AWS_KEY_ID='1a2b3c' AWS_SECRET_KEY='4x5y6z')
  ENCRYPTION=(TYPE='AWS_SSE_KMS' KMS_KEY_ID = 'aws/key');

Same example as the immediately preceding example, except that the Snowflake access permissions for the S3 bucket as associated with an IAM role instead of an IAM user. Note that credentials are handled separately from other stage parameters such as ENCRYPTION. Support for these other parameters is the same regardless of the credentials used to access your external S3 bucket:

CREATE OR REPLACE STAGE my_ext_stage3 URL='s3://load/encrypted_files/'
  CREDENTIALS=(AWS_ROLE='arn:aws:iam::001234567890:role/mysnowflakerole')
  ENCRYPTION=(TYPE='AWS_SSE_KMS' KMS_KEY_ID = 'aws/key');

Google Cloud Storage

Create an external stage named my_ext_stage using a private/protected GCS bucket named load with a folder path named files. Secure access to the GCS bucket is provided via the load storage integration:

CREATE OR REPLACE STAGE my_ext_stage URL='gcs://load/files/'
    STORAGE_INTEGRATION = myint;

Microsoft Azure

Create an external stage named mystage using an Azure storage account named myaccount and a container named mycontainer with a folder path named files and client-side encryption enabled. The stage references a file format named my_csv_format:

CREATE OR REPLACE STAGE mystage
  URL='azure://myaccount.blob.core.windows.net/mycontainer/files/'
  CREDENTIALS=(AZURE_SAS_TOKEN='?sv=2016-05-31&ss=b&srt=sco&sp=rwdl&se=2018-06-27T10:05:50Z&st=2017-06-27T02:05:50Z&spr=https,http&sig=bgqQwoXwxzuD2GJfagRg7VOS8hzNr3QLT7rhS8OFRLQ%3D')
  ENCRYPTION=(TYPE='AZURE_CSE' MASTER_KEY = 'kPxX0jzYfIamtnJEUTHwq80Au6NbSgPH5r4BDDwOaO8=')
  FILE_FORMAT = my_csv_format;

(The AZURE_SAS_TOKEN and MASTER_KEY values used in this example are not actual values; they are provided for illustration purposes only.)