Summary of Data Loading Features

This topic provides a quick reference to the features supported when using the COPY INTO <table> command to load data from files into Snowflake tables.

Data File Details

The following table describes the general details for the files used to load data:

| Feature | Supported | Notes |
|---|---|---|
| Location of files | Local environment | Files are first staged in a Snowflake stage, then loaded into a table. |
| | AWS S3 | Files can be loaded directly from any user-supplied S3 bucket. |
| | Google Cloud Storage | Files can be loaded directly from any user-supplied Cloud Storage bucket. |
| | Microsoft Azure | Files can be loaded directly from any user-supplied Azure container. |
| File formats | Delimited files (CSV, TSV, etc.) | Any valid delimiter is supported; default is comma (i.e. CSV). |
| | JSON | |
| | Avro | Includes automatic detection and processing of staged Avro files that were compressed using Snappy. |
| | ORC | Includes automatic detection and processing of staged ORC files that were compressed using Snappy or zlib. |
| | Parquet | Includes automatic detection and processing of staged Parquet files that were compressed using Snappy. |
| | XML | Supported as a preview feature. |
| File encoding | File format-specific | For delimited files (CSV, TSV, etc.), the default character set is UTF-8. To use any other character set, you must explicitly specify the encoding to use for loading. For the list of supported character sets, see Supported Character Sets for Delimited Files. For all other supported file formats (JSON, Avro, etc.), the only supported character set is UTF-8. |
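As a minimal sketch of the two loading paths described above (staging local files first versus loading directly from cloud storage), the statements below may help. The names `my_table`, `my_int_stage`, `my_s3_int`, the bucket URL, and the local file path are all hypothetical placeholders, and the file format options shown are only one possible choice.

```sql
-- Path 1: local file -> Snowflake stage -> table.
-- All object names and paths here are hypothetical.
CREATE OR REPLACE STAGE my_int_stage;

-- PUT is issued from a client such as SnowSQL; it uploads the local file to the stage.
PUT file:///tmp/data/orders.tsv @my_int_stage;

-- Load the staged file; the delimiter is overridden because the default is a comma.
COPY INTO my_table
  FROM @my_int_stage
  FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = '\t' SKIP_HEADER = 1);

-- Path 2: load directly from a user-supplied S3 bucket.
-- Assumes a storage integration named my_s3_int has already been configured.
COPY INTO my_table
  FROM 's3://my-bucket/orders/'
  STORAGE_INTEGRATION = my_s3_int
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
```

Note that delimited files of any kind use `TYPE = CSV`; a TSV file differs only in its `FIELD_DELIMITER` setting.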

Supported Character Sets for Delimited Files

The following table lists the encoding character sets supported for loading data from delimited files (CSV, TSV, etc.):

| Character Set | ENCODING Value | Supported Languages | Notes |
|---|---|---|---|
| Big5 | BIG5 | Traditional Chinese | |
| EUC-JP | EUCJP | Japanese | |
| EUC-KR | EUCKR | Korean | |
| GB18030 | GB18030 | Chinese | |
| IBM420 | IBM420 | Arabic | |
| IBM424 | IBM424 | Hebrew | |
| ISO-2022-CN | ISO2022CN | Simplified Chinese | |
| ISO-2022-JP | ISO2022JP | Japanese | |
| ISO-2022-KR | ISO2022KR | Korean | |
| ISO-8859-1 | ISO88591 | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish | |
| ISO-8859-2 | ISO88592 | Czech, Hungarian, Polish, Romanian | |
| ISO-8859-5 | ISO88595 | Russian | |
| ISO-8859-6 | ISO88596 | Arabic | |
| ISO-8859-7 | ISO88597 | Greek | |
| ISO-8859-8 | ISO88598 | Hebrew | |
| ISO-8859-9 | ISO88599 | Turkish | |
| KOI8-R | KOI8R | Russian | |
| Shift_JIS | SHIFTJIS | Japanese | |
| UTF-8 | UTF8 | All languages | For loading data from delimited files (CSV, TSV, etc.), UTF-8 is the default. For loading data from all other supported file formats (JSON, Avro, etc.), as well as unloading data, UTF-8 is the only supported character set. |
| UTF-16 | UTF16 | All languages | |
| UTF-16BE | UTF16BE | All languages | |
| UTF-16LE | UTF16LE | All languages | |
| UTF-32 | UTF32 | All languages | |
| UTF-32BE | UTF32BE | All languages | |
| UTF-32LE | UTF32LE | All languages | |
| windows-1250 | WINDOWS1250 | Czech, Hungarian, Polish, Romanian | |
| windows-1251 | WINDOWS1251 | Russian | |
| windows-1252 | WINDOWS1252 | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish | |
| windows-1253 | WINDOWS1253 | Greek | |
| windows-1254 | WINDOWS1254 | Turkish | |
| windows-1255 | WINDOWS1255 | Hebrew | |
| windows-1256 | WINDOWS1256 | Arabic | |
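To illustrate how an ENCODING value from the list above might be supplied when loading a non-UTF-8 delimited file, here is a short sketch. The table, stage, path, and file format names are hypothetical; the encoding values themselves come from the list above.

```sql
-- Inline file format: load ISO-8859-1 encoded CSV files.
-- my_table, my_stage, and the legacy/ path are hypothetical.
COPY INTO my_table
  FROM @my_stage/legacy/
  FILE_FORMAT = (TYPE = CSV ENCODING = 'ISO88591');

-- Alternatively, define a reusable named file format (name is hypothetical)
-- for Shift_JIS encoded files and reference it in the load.
CREATE OR REPLACE FILE FORMAT my_sjis_csv
  TYPE = CSV
  ENCODING = 'SHIFTJIS';

COPY INTO my_table
  FROM @my_stage/japan/
  FILE_FORMAT = (FORMAT_NAME = 'my_sjis_csv');
```

Because UTF-8 is the default for delimited files, the `ENCODING` option can be omitted entirely for UTF-8 data.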

Compression of Staged Files

The following table describes how Snowflake handles compression of data files for loading. The options are different depending on whether the files are staged uncompressed or already-compressed:

| Feature | Supported | Notes |
|---|---|---|
| Uncompressed files | gzip | When staging uncompressed files in a Snowflake stage, the files are automatically compressed using gzip, unless compression is explicitly disabled. |
| Already-compressed files | gzip, bzip2, deflate, raw_deflate | Snowflake can automatically detect any of these compression methods, or you can explicitly specify the method that was used to compress the files. |
| | Brotli, Zstandard | Auto-detection is not yet supported for these methods; when staging or loading files compressed with either of these methods, you must explicitly specify the compression method that was used. |
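The compression cases above could be expressed roughly as follows. This is a sketch under assumptions: the stage `my_int_stage`, the table `my_table`, and the local file paths are hypothetical, and only a few of the possible option combinations are shown.

```sql
-- Uncompressed file: staged with automatic gzip compression disabled.
PUT file:///tmp/data/orders.csv @my_int_stage AUTO_COMPRESS = FALSE;

-- Brotli-compressed file: the method is not auto-detected,
-- so it is stated explicitly when staging.
PUT file:///tmp/data/orders.csv.br @my_int_stage SOURCE_COMPRESSION = BROTLI;

-- The method is stated again when loading the Brotli file.
COPY INTO my_table
  FROM @my_int_stage
  FILES = ('orders.csv.br')
  FILE_FORMAT = (TYPE = CSV COMPRESSION = BROTLI);

-- Zstandard-compressed files in the same stage (assumed to exist there),
-- selected by a regular expression on the file name.
COPY INTO my_table
  FROM @my_int_stage
  PATTERN = '.*[.]zst'
  FILE_FORMAT = (TYPE = CSV COMPRESSION = ZSTD);
```

For gzip, bzip2, deflate, and raw_deflate files, the compression options can be left at their defaults so that Snowflake detects the method.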

Encryption of Staged Files

The following table describes how Snowflake handles encryption of data files for loading. The options are different depending on whether the files are staged unencrypted or already-encrypted:

| Feature | Supported | Notes |
|---|---|---|
| Unencrypted files | 128-bit or 256-bit keys | When staging unencrypted files in a Snowflake internal location, the files are automatically encrypted using 128-bit keys. 256-bit keys can be enabled (for stronger encryption); however, additional configuration is required. |
| Already-encrypted files | User-supplied key | Files that are already encrypted can be loaded into Snowflake from external cloud storage; the key used to encrypt the files must be provided to Snowflake. |
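For the already-encrypted case, one way to supply the key is through the definition of an external stage. The sketch below assumes files in S3 that were encrypted client-side; the stage name, bucket URL, storage integration `my_s3_int`, and table `my_table` are hypothetical, and the key placeholder must be replaced with the actual Base64-encoded key.

```sql
-- External stage over client-side encrypted files in S3.
-- All names are hypothetical; the MASTER_KEY value is a placeholder.
CREATE OR REPLACE STAGE my_encrypted_s3_stage
  URL = 's3://my-bucket/encrypted/'
  STORAGE_INTEGRATION = my_s3_int
  ENCRYPTION = (TYPE = 'AWS_CSE' MASTER_KEY = '<base64-encoded key>');

-- Snowflake uses the key from the stage definition to decrypt the files during the load.
COPY INTO my_table
  FROM @my_encrypted_s3_stage
  FILE_FORMAT = (TYPE = CSV);
```

Other cloud providers and encryption types use different `ENCRYPTION` settings, so this form should be treated as one example rather than a general recipe.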