Categories:
Semi-structured Data Functions (Parsing)

PARSE_XML

Interprets an input string as an XML document, producing an OBJECT value. If the input is NULL, the output is NULL.

Usage Notes

Every elementary value in XML documents is text; however it is not very useful for processing the data.

PARSE_XML attempts to convert obviously numeric and Boolean values to the native representation in a way that printing these values back produces textually identical results. When parsing decimal numbers, PARSE_XML attempts to preserve exactness of the representation by treating 123.45 as NUMBER(5,2), not as a DOUBLE. However, numbers in scientific notation (i.e. 1.2345e+02) or numbers that cannot be stored as fixed-point decimals due to range or scale limitations are stored as DOUBLE. Because XML does not represent values such as TIMESTAMP, DATE, TIME, or BINARY natively, these have to be represented as strings.

In the current Snowflake release, PARSE_XML does not provide automatic recognition of these values; they are treated as strings (and the explicit conversion from strings is required).

Examples

CREATE TABLE xtab (v OBJECT);

INSERT INTO xtab SELECT PARSE_XML(column1) AS v
  FROM VALUES ('<a/>'), ('<a attr="123">text</a>'), ('<a><b>X</b><b>Y</b></a>');

SELECT * FROM xtab;

-------------------------+
           V             |
-------------------------+
 <a></a>                 |
 <a attr="123">text</a>  |
 <a><b>X</b><b>Y</b></a> |
-------------------------+