Categories:

Aggregate Functions (Cardinality Estimation), Window Functions

HLL

Uses HyperLogLog to return an approximation of the distinct cardinality of the input (i.e. HLL(col1, col2, ... ) returns an approximation of COUNT(DISTINCT col1, col2, ... )).

For more information about HyperLogLog, see Estimating Number of Distinct Values.

Aliases:

APPROX_COUNT_DISTINCT.

See also:

HLL_ACCUMULATE, HLL_COMBINE, HLL_ESTIMATE

Syntax

Aggregate function

HLL( [ DISTINCT ] <expr1> [ , ... ] )

HLL(*)

Window function

HLL( [ DISTINCT ] <expr1> [ , ... ] ) OVER ( [ PARTITION BY <expr2> ] )

HLL(*) OVER ( [ PARTITION BY <expr2> ] )

Arguments

expr1

This is the expression for which you want to know the number of distinct values.

expr2

This is the optional expression used to group rows into partitions.

Returns

The data type of the returned value is INTEGER.

Usage Notes

  • DISTINCT can be included as an argument, but has no effect.

  • For information about NULL values and aggregate functions, see Aggregate Functions and NULL Values.

  • When used as a window function:

    • This function does not support:

      • ORDER BY sub-clause in the OVER() clause.

      • Window frames.
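As a rough illustration of the window-function form described above, the following sketch estimates the number of distinct users within each region. The table page_visits, its columns, and its rows are hypothetical and exist only for this illustration. The OVER() clause uses only PARTITION BY, with no ORDER BY sub-clause and no window frame.

```sql
-- Hypothetical table and data, for illustration only.
CREATE OR REPLACE TEMPORARY TABLE page_visits (region VARCHAR, user_id INTEGER);

INSERT INTO page_visits (region, user_id) VALUES
  ('east', 1), ('east', 2), ('east', 2),
  ('west', 3), ('west', 3);

-- Each row is returned together with the approximate number of
-- distinct user_id values in that row's region.
SELECT region,
       user_id,
       HLL(user_id) OVER (PARTITION BY region) AS approx_users_in_region
  FROM page_visits;
```

Because the result is an approximation, the estimate is not guaranteed to equal the exact distinct count for a partition.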

Examples

This example shows how to use HLL and its alias APPROX_COUNT_DISTINCT. It also calls COUNT(DISTINCT i) to emphasize that the approximate results do not always match the exact count.

SELECT COUNT(i), COUNT(DISTINCT i), APPROX_COUNT_DISTINCT(i), HLL(i)
  FROM sequence_demo;

The results might vary because HLL() and APPROX_COUNT_DISTINCT() return approximations, not exact values.

Output:

+----------+-------------------+--------------------------+--------+
| COUNT(I) | COUNT(DISTINCT I) | APPROX_COUNT_DISTINCT(I) | HLL(I) |
|----------+-------------------+--------------------------+--------|
|     1024 |              1024 |                     1030 |   1030 |
+----------+-------------------+--------------------------+--------+
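
The example above queries a table named sequence_demo whose definition is not shown in this section. As a minimal sketch, assuming the table only needs an integer column i holding 1024 distinct values, a comparable table could be created with the GENERATOR table function and SEQ4(). This setup is an assumption for illustration, not necessarily how the original table was built.

```sql
-- Assumed setup: 1024 rows, each with a distinct integer in column i.
-- SEQ4() values are unique here but are not guaranteed to be gap-free.
CREATE OR REPLACE TABLE sequence_demo AS
  SELECT SEQ4() AS i
    FROM TABLE(GENERATOR(ROWCOUNT => 1024));

-- Compare the exact and approximate distinct counts.
SELECT COUNT(i), COUNT(DISTINCT i), APPROX_COUNT_DISTINCT(i), HLL(i)
  FROM sequence_demo;
```

With this setup, COUNT(i) and COUNT(DISTINCT i) are exact, while the two approximate columns can differ from them by a small margin.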