Troubleshooting Bulk Data Loads

This topic describes a methodical approach to troubleshooting issues with bulk data loads.

In this Topic:

Step 1. Viewing the COPY History for the Table

Query the load activity history for a table, including any attempted data loads using Snowpipe. For information, see COPY_HISTORY. The STATUS column indicates whether a particular set of files was loaded, partially loaded, or failed to load. The FIRST_ERROR_MESSAGE column provides a reason when an attempt partially loaded or failed.

Note that if a set of files has multiple issues, the FIRST_ERROR_MESSAGE column only indicates the first error encountered. To view all errors in the files, see Step 2. Validating the Data Load for instructions.

Step 2. Validating the Data Load

The VALIDATION_MODE copy option instructs a COPY statement to validate the data to be loaded and return results based on the validation option specified. No data is loaded when this copy option is specified. For more information about the copy option, see COPY INTO <table>.

Execute a COPY statement with the VALIDATION_MODE copy option set to RETURN_ALL_ERRORS. In the statement, reference the set of files you had attempted to load.

The following example validates a set of files that contain errors. To facilitate analysis of the errors, a COPY INTO <location> statement then unloads the problematic records into a text file so they could be analyzed and fixed in the original data files. The statement queries the RESULT_SCAN table function to retrieve the records. Note that the statements in this section must be run in succession in order to retrieve the applicable records using the LAST_QUERY_ID function.

COPY INTO mytable
  FROM @mystage/myfile.csv.gz
  VALIDATION_MODE=RETURN_ALL_ERRORS;

SET qid=last_query_id();

COPY INTO @mystage/errors/load_errors.txt FROM (SELECT rejected_record FROM TABLE(result_scan($qid)));