Categories:
Aggregate Functions (Percentile Estimation) , Window Functions

APPROX_PERCENTILE

Returns an approximated value for the desired percentile (i.e. if column c has n numbers, then APPROX_PERCENTILE(c, p) returns a number such that approximately n * p of the numbers in c are smaller than the returned number).

This function uses the t-Digest algorithm. For more information, see Estimating Percentile Values.

Collation:No impact
See also:APPROX_PERCENTILE_ACCUMULATE , APPROX_PERCENTILE_COMBINE , APPROX_PERCENTILE_ESTIMATE

Syntax

APPROX_PERCENTILE( <expr> , <percentile> )

Arguments

expr
A valid expression, such as a column name, that evaluates to a numeric value.
percentile
A constant real value greater than or equal to 0.0 and less than 1.0. This indicates the percentile (from 0 to 99.999…). E.g. The value 0.65 indicates the 65th percentile.

Returns

The output is returned as a DOUBLE value.

Usage Notes

  • Percentile works only on numeric values, so expr should produce values that are numbers or can be cast to numbers.
  • The values returned are not necessarily in the data set.
  • The value returned is an approximation. The size of the data set and the skew in the data set affect the accuracy of the approximation.

Examples

Demonstrate the APPROX_PERCENTILE function:

Create and populate a table with values:

CREATE TABLE testtable (c1 INTEGER);
INSERT INTO testtable (c1) VALUES 
    (0),
    (1),
    (2),
    (3),
    (4),
    (5),
    (6),
    (7),
    (8),
    (9),
    (10);

Run queries and show the output:

SELECT APPROX_PERCENTILE(c1, 0.1) FROM testtable;
+----------------------------+
| APPROX_PERCENTILE(C1, 0.1) |
|----------------------------|
|                        1.5 |
+----------------------------+
SELECT APPROX_PERCENTILE(c1, 0.5) FROM testtable;
+----------------------------+
| APPROX_PERCENTILE(C1, 0.5) |
|----------------------------|
|                        5.5 |
+----------------------------+

Note that the value returned in this case is higher than any value actually in the data set:

SELECT APPROX_PERCENTILE(c1, 0.999) FROM testtable;
+------------------------------+
| APPROX_PERCENTILE(C1, 0.999) |
|------------------------------|
|                         10.5 |
+------------------------------+