Subqueries and Subquery Operators

A subquery is a query within another query. Subqueries in a FROM or WHERE clause are used to provide data that will be used to limit or compare/evaluate the data returned by the containing query. Subqueries can be categorized as correlated or uncorrelated:

  • A correlated subquery refers to columns from outside of the subquery. A correlated subquery can be thought of as a filter on the table that it refers to, as if the subquery were evaluated on each row of the table.
  • An uncorrelated subquery has no such external column references.

Snowflake currently supports the following types of subqueries:

  • Uncorrelated scalar subqueries in any place that a value expression can be used.
  • Correlated scalar subqueries in WHERE clauses.
  • EXISTS, ANY / ALL, and IN subqueries in WHERE clauses. These subqueries may be correlated or uncorrelated.

Subquery operators operate on nested query expressions. They can be used to compute values that are:

In this Topic:

Demonstrating the Difference between a Correlated and a Non-Correlated Subquery

This query shows an uncorrelated subquery in a WHERE clause. The subquery gets the per capita GDP of Brazil, and the outer query selects all the jobs (in any country) that pay less than the per-capita GDP of Brazil. The subquery is uncorrelated because the value that it returns does not depend upon any column of the outer query. The subquery only needs to be called once during the entire execution of the outer query.

SELECT p.name, p.annual_wage, p.country
  FROM pay AS p
  WHERE p.annual_wage < (SELECT per_capita_GDP
                           FROM international_GDP
                           WHERE name = 'Brazil');

This shows a correlated subquery in a WHERE clause. This query lists jobs where the annual pay of the job is less than the per-capita GDP in that country. This subquery is correlated because it is called once for each row in the outer query and is passed a value (p.country (country name)) from that row.

SELECT p.name, p.annual_wage, p.country
  FROM pay AS p
  WHERE p.annual_wage < (SELECT MAX(per_capita_GDP)
                           FROM international_GDP i
                           WHERE p.country = i.name);

Note

The aggregate function MAX() is not logically necessary in this case because the international_GDP table has only one row per country; however, because the server doesn’t know that, and because the server requires that the subquery return no more than one row, we use the aggregate function to force the server to recognize that the subquery will return only one row each time that the subquery is executed. In this example, MAX(), MIN(), or AVG() would all have worked because applying any of these to a single value will return that value unchanged.

Scalar Subqueries

A scalar subquery is a subquery that returns at most one row. A scalar subquery can appear anywhere that a value expression can appear, including the SELECT list, GROUP BY clause, or as an argument to a function in a WHERE or HAVING clause.

Syntax

( <query> )

Usage Notes

  • A scalar subquery can contain only one item in the SELECT list.
  • If a scalar subquery returns more than one row, a runtime error is generated.
  • Correlated scalar subqueries are currently supported only if they can be statically determined to return one row, e.g. if the SELECT list contains an aggregate function with no GROUP BY.
  • Uncorrelated scalar subqueries are supported anywhere that a value expression is allowed.
  • Subqueries with a correlation inside of FLATTEN are currently unsupported.

Examples

This shows a simple uncorrelated subquery in a WHERE clause:

SELECT employee_id
FROM employees
WHERE salary = (SELECT max(salary) FROM employees);

This shows an uncorrelated subquery in a FROM clause; this simple subquery merely gets a subset of the information in the international_GDP table. The overall query lists jobs in “high-wage” countries where the annual pay of the job is the same as the per_capita_GDP in that country.

SELECT p.name, p.annual_wage, p.country
  FROM pay AS p INNER JOIN (SELECT name, per_capita_GDP
                              FROM international_GDP
                              WHERE per_capita_GDP >= 10000.0) AS pcg
    ON pcg.per_capita_GDP = p.annual_wage AND p.country = pcg.name;

[ NOT ] EXISTS Subqueries

An EXISTS subquery is a boolean expression that can appear in a WHERE or HAVING clause, or in any function that operates on a boolean expression:

  • An EXISTS expression evaluates to TRUE if any rows are produced by the subquery.
  • A NOT EXISTS expression evaluates to TRUE if no rows are produced by the subquery.

Syntax

[ NOT ] EXISTS ( <query> )

Usage Notes

  • Correlated EXISTS subqueries are currently supported only in a WHERE clause.
  • Correlated EXISTS subqueries cannot appear as an argument to an OR operator.
  • Uncorrelated EXISTS subqueries are supported anywhere that a boolean expression is allowed.

Examples

Use a correlated NOT EXISTS subquery to find the departments that have no employees:

SELECT department_id
FROM departments d
WHERE NOT EXISTS (SELECT 1
                  FROM employees e
                  WHERE e.department_id = d.department_id);

ANY / ALL Subqueries

The ANY and ALL keywords can be used to apply a comparison operator to the values produced by a subquery (which can return more than one row).

Syntax

<expr> { = | != | > | >= | < | <= } ANY ( <query> )

<expr> { = | != | > | >= | < | <= } ALL ( <query> )

Usage Notes

  • The expression is compared with the operator to each value that the subquery returns:

    • If ANY is specified, then the result is TRUE if any row of the subquery satisfies the condition, otherwise it returns FALSE.
    • If ALL is specified, then the result is TRUE if every row of the subquery satisfies the condition, otherwise it returns FALSE.
  • ANY/ALL subqueries are currently supported only in a WHERE clause.

  • ANY/ALL subqueries cannot appear as an argument to an OR operator.

  • The subquery must contain only one item in its SELECT list.

Examples

Use a != ALL subquery to find the departments that have no employees:

SELECT department_id
FROM departments d
WHERE d.department_id != ALL (SELECT e.department_id
                              FROM employees e);

[ NOT ] IN Subqueries

The IN and NOT IN operators check if an expression is included or not included in the values returned by a subquery.

Syntax

<expr> [ NOT ] IN ( <query> )

Usage Notes

  • IN is shorthand for = ANY, and is subject to the same restrictions as ANY subqueries.
  • NOT IN is shorthand for != ALL, and is subject to the same restrictions as ALL subqueries.

Examples

Use a NOT IN subquery that is equivalent to the != ALL subquery in the previous example:

SELECT department_id
FROM departments d
WHERE d.department_id NOT IN (SELECT e.department_id
                              FROM employees e);