# Using Window Functions¶

Window functions operate on windows, which are groups of rows that are related (e.g. by date or by location).

This tutorial describes all the different types of window functions supported by Snowflake:

- Window functions that support window frames.
- Window functions that are order-sensitive (aka
*analytic functions*). - Aggregate functions called as window functions.

**In this Topic:**

## Introduction to Window Functions¶

### Aggregate Functions¶

To understand window functions, you must understand aggregate functions. A brief definition is that an aggregate function is a function that takes multiple values as input, and returns 1 value. (Technically, an aggregate function can take 0, 1, or multiple rows as input, but for simplicity, we usually just say that an aggregate function takes “multiple rows”).

For example, the SUM() function takes multiple input values and then returns a single output value that is the sum of all the input rows.

For a more complete explanation of aggregate functions, see Aggregate Functions.

### Windows and Window Functions¶

A window is a group of 0, 1, or more rows. (For simplicity, we usually just say that a window contains “multiple rows”, even though in some cases it might contain 0 or 1.)

The rows in a window are related in some way. For example, the rows might be related by location (all from the same province or state) or by time (all from the same month or fiscal quarter).

To use a window, an SQL statement defines that window — for example, the statement might explicitly partition rows into separate windows based on province or state. If the statement does not explicitly define the windows, then it operates on a single window.

A function that is passed multiple rows (more precisely, a column name or expression that generates one value for each of the rows) is called a “window function”.

Because the definition of an aggregate function is “a function that takes multiple rows (values) and produces one output row (value)”. Also, because all window functions accept multiple inputs and produce one output, all window functions are aggregate functions.

The converse is not necessarily true; an aggregate function is not necessarily a window function, so the terms are not synonymous. However, most or all current Snowflake aggregate functions can be used as window functions. For a list of all aggregate functions, including the functions that can also be called as window functions, see Aggregate Functions.

### Analytic Functions¶

An analytic function is a window function for which the order of the rows in the window is significant.

The output of an analytic function depends upon:

- The individual row passed to the function.
- The values of the other rows in the window passed to the function.
- The order of all the rows in the window.

For example, consider the RANK function. If you want to calculate the rank of each of your salespeople, you must specify the criterion on which to rank them — for example, the number of dollars of product they’ve sold, the percentage of customers who are still happy with the product 3 years after purchasing it, etc. Conceptually, you can calculate the rank by putting the sales people in order (e.g. by amount sold last month, in order from highest to lowest) and then calling the RANK() function to show which salesperson sold the most, which sold the second most, etc.

The rank of each salesperson depends on both the individual sales person (e.g. the amount she sold) and on how much every other sales person sold in that same time period. In database terms, the output of the rank function depends both upon the individual row, and on all the other rows within the window.

For an analytic function, the notion of order is crucial. All analytic functions require that the rows be ordered within the window according to some criterion.

Below is a simple example of the output from a query that used the analytic function RANK() to rank salespeople by total sales over the most recent time period:

```
+-------------+-------------+------+
| Salesperson | Amount Sold | Rank |
|-------------+-------------+------|
| Smith | 2000 | 1 |
| Jones | 1500 | 2 |
| Torkelson | 1200 | 3 |
+-------------+-------------+------+
```

The SQL syntax used to generate reports like this is shown in more detail below and in Window Functions.

For analytic functions, the function must be called once per row. For example, the RANK() function returns a different value for each row (assuming no tied values), so the RANK() function is called once for each row, and is passed both that row and the entire window (so that the row’s rank can be determined by comparing it to every other row in that window).

Not all window functions are analytic functions. The following section contrasts analytic functions and non-analytic window functions in more detail.

#### Contrasting Analytic Functions and non-Analytic Aggregate Functions¶

For non-analytic window functions (and indeed for all non-analytic aggregate functions), order is not important. For example, consider the aggregate function SUM():

SELECT salesperson, amount_sold, SUM(AMOUNT_SOLD) FROM sales_table;

The output might look similar to:

+-------------+-------------+-------+ | Salesperson | Amount Sold | Total | |-------------+-------------+-------| | Smith | 2000 | 4700 | | Jones | 1500 | 4700 | | Torkelson | 1200 | 4700 | +-------------+-------------+-------+

Because the query simply sums the total sales across the entire window, the order doesn’t matter, and the sum is the same for each row. In fact, in this case the aggregate function used as a window function only needs to be called once per window, not once for each row in the window.

This is different from the requirement for an analytic function, which must be called once for each row in each window, because the output is different for each row in that window.

### Window Frames¶

A window frame contains a subset of the rows in a window. (The term is slightly confusing; in your house, your windows are inside a window frame, but inside your database, your window frame is inside a window! A better term might have been “pane” rather than “frame”.)

There are two types of window frames:

- Cumulative.
- Sliding.

#### Cumulative Window Frame¶

A cumulative window is a variable-width window that starts at a fixed point and grows with each new row. For example, “sales total so far this month” is calculated using a cumulative window that starts on the first of the month and expands as each new day’s data is added. In the table below, the third column shows the output from a cumulative window frame that calculates the sales total so far this month:

```
+--------+-------+--------------+
| Day of | Sales | Sales So Far |
| Month | Today | This Month |
|--------+-------+--------------+
| 1 | 10 | 10 |
| 2 | 10 | 20 |
| 3 | 10 | 30 |
| ... | ... | ... |
+--------+-------+--------------+
```

#### Sliding Window Frame¶

A sliding window frame is a fixed-width frame that “slides along” the rows in the window; as new rows appear, old rows are removed from the window, so that the width of the frame is always the same.

For example, a 13-week moving average of a stock price is a sliding window; for each day, it shows us the average price over the last 13 weeks (91 days), including the current day.

For example:

- On June 30th, the function returns the average price for April 1 - June 30.
- On July 1st, the function returns the average price for April 2 - July 1.
- On July 2nd, the function returns the average price for April 3 - July 2.
- Etc.

Note

Although sliding windows are fixed-width, there is an exception when there is not enough data to fill the window. For example, if record-keeping started on April 1st, then obviously from April 1st to June 29th, the sliding window would have fewer than 91 days of data in it. Only after the window finishes filling does its width remain fixed.

Here’s an illustration of a moving average using a 3-day sliding window.

This window always keeps the 3 most recent values and calculates the total within that fixed-size window. You can loosely picture this as a fixed-width window frame sliding from left to right along a timeline.

The timeline below shows the 7 days of the week.

```
|--1--|--2--|--3--|--4--|--5--|--6--|--7--|
```

Below, we show a 3-day sliding window frame as it would look on the 3rd day of the week:

```
|--1--|--2--|--3--|--4--|--5--|--6--|--7--|
|------ frame ----|
```

As you can see, this window frame covers “today” (the 3rd day of the week) and the preceding two days. On the 4th day of the week, the window frame slides rightward one day, and looks like:

```
|--1--|--2--|--3--|--4--|--5--|--6--|--7--|
|------ frame ----|
```

```
+--------+-------+---------------+
| Day of | Sales | Most Recent |
| Month | Today | 3 Days' Sales |
|--------+-------+---------------+
| 1 | 10 | 10 |
| 2 | 11 | 21 |
| 3 | 12 | 33 |
| 4 | 13 | 36 |
| 5 | 14 | 39 |
| ... | ... | ... |
+--------+-------+---------------+
```

As you can see from this example, the last column in the table contains the sum of the three most recent days’ worth of sales data. So, for example, for day 4, the sum (36) is the sum of the sales for days 2, 3, and day 4 (11 + 12 + 13).

### Illustrated Example¶

This example illustrates many of the concepts described above:

- A window frame that acts as a cumulative window.
- A window frame that acts as a sliding window.
- A window that uses an aggregate function that does not use a window frame and is neither cumulative nor sliding.
- A window used by an analytic function.

Suppose that you want a financial report that shows values based on sales over the last week:

- Daily sales.
- A “rank” column that indicates where this day ranked in sales (1st (highest), 2nd, etc. within this week). This uses a window function, specifically an analytic function.
- Sales so far this week, i.e. the “running sum” for all days from the beginning of the week up through and including the current date. This uses a window function, more specifically an aggregate function in the context of a cumulative window frame.
- Total sales for the week. This uses an aggregate function, SUM(). Although SUM() can be used as a window function, for this column it’s used as a pure aggregate function.
- The 3-day moving average, which is the average of today and the two previous days. This uses a window function, more specifically an aggregate function in a sliding window frame.

The table would look something like the following:

```
+--------+-------+------+--------------+-------------+--------------+
| Day of | Sales | Rank | Sales So Far | Total Sales | 3-Day Moving |
| Week | Today | | This Week | This Week | Average |
|--------+-------+------+--------------+-------------|--------------+
| 1 | 10 | 4 | 10 | 84 | 10.0 |
| 2 | 14 | 3 | 24 | 84 | 12.0 |
| 3 | 6 | 5 | 30 | 84 | 10.0 |
| 4 | 6 | 5 | 36 | 84 | 9.0 |
| 5 | 14 | 3 | 50 | 84 | 10.0 |
| 6 | 16 | 2 | 66 | 84 | 11.0 |
| 7 | 18 | 1 | 84 | 84 | 12.0 |
+--------+-------+------+--------------+-------------+--------------+
```

## Window Function Syntax¶

### Introduction¶

The syntax of window functions is covered in Window Functions, which is a reference topic and might seem a little terse for new users, so this tutorial provides some information about a subset of the syntax required by window functions.

### Syntax¶

Windows and window frames are specified using an `OVER`

clause:

```
OVER ( [ PARTITION BY <expr1> ]
[ ORDER BY <expr2> [ { cumulativeFrame | slidingFrame } ] ]
)
```

The `PARTITION BY`

sub-clause partitions the data into windows. For
example, if you want to calculate running sums of monthly sales for
more than one month, you could partition the data by month. That
would allow you to calculate one running sum of sales for January,
another running sum for February, etc.

The `ORDER BY`

sub-clause is primarily for analytic window functions and
for sliding and cumulative window frame functions; it determines the
order of rows within each window.

For analytic functions, the `ORDER BY`

sub-clause is required, not optional.
This is because analytic functions only operate on windows that contain rows
in a specified order.

## Examples¶

This first set of examples is based on the table used earlier in this tutorial:

```
+--------+-------+------+--------------+-------------+--------------+
| Day of | Sales | Rank | Sales So Far | Total Sales | 3-Day Moving |
| Week | Today | | This Week | This Week | Average |
|--------+-------+------+--------------+-------------|--------------+
| 1 | 10 | 4 | 10 | 84 | 10.0 |
| 2 | 14 | 3 | 24 | 84 | 12.0 |
| 3 | 6 | 5 | 30 | 84 | 10.0 |
| 4 | 6 | 5 | 36 | 84 | 9.0 |
| 5 | 14 | 3 | 50 | 84 | 10.0 |
| 6 | 16 | 2 | 66 | 84 | 11.0 |
| 7 | 18 | 1 | 84 | 84 | 12.0 |
+--------+-------+------+--------------+-------------+--------------+
```

The SQL for this query is somewhat complex. Rather than show you one query that does all this, we’ll show you the SQL for some of the individual columns.

In the real world, we’d probably have years worth of data, and to calculate sums and averages for one specific week of data, we’d need to use a window one week wide, or use a filter similar to:

```
WHERE date >= start_of_relevant_week and date <= end_of_relevant_week
```

To focus on windowing, we’ll simplify and assume that our table contains only the most recent week’s worth of data.

### Calculating Total Sales This Week¶

The simplest column is “Total Sales This Week”. It can be calculated using an aggregate function without being in a window context.

```
SELECT SUM(sales_today)
FROM my_table;
```

### Calculating Sales So Far This Week¶

The column “sales so far this week” is calculated using a window function. The SQL is similar to:

```
SELECT SUM(sales_today) OVER (ORDER BY sales_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM my_table;
```

This says to put the rows in order by date, and then for each date calculate the sum of sales from the start of the window up to (and including the current date.

### Calculating a 3-Day Moving Average¶

In our simple example of a 3-day average calculated from the beginning of the week, the SQL code is similar to:

```
SELECT SUM(sales_today) OVER (ORDER BY sales_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM my_table;
```

The difference between this sliding window and the preceding cumulative window is simply the starting point; it’s a fixed point for the cumulative window, but slides rightward for the sliding window.

**Rank:**

```
SELECT RANK(sales_today) OVER (ORDER BY sales_today)
FROM my_table;
```

### Illustrating the Syntax With More Examples¶

This section provides more detail about the windowing sub-clauses used by window functions, and how these sub-clauses work together. This section includes examples with extensive explanations.

Note that many of these examples have two `ORDER BY`

clauses, one for
the window clause, and one to put the result set in the most readable order.
In our discussions below, when we refer to the `ORDER BY`

clause, we are
usually or always referring to the `ORDER BY`

clause in the window.

These examples use the following table and data:

```
CREATE TABLE sales (sales_date DATE, quantity INTEGER);
INSERT INTO sales (sales_date, quantity) VALUES
('2018-01-01', 1),
('2018-01-02', 3),
('2018-01-03', 5),
('2018-02-01', 2)
;
```

#### Example Window With PARTITION BY clause¶

The `PARTITION BY`

sub-clause of the windowing clause divides the data
into distinct sub-sets based on the value of

; the SELECT statement is
applied to each subset, and the output has rows for each subset.*expr1*

(This is similar to, but not identical to, what the `GROUP BY`

clause does.)

The next example shows the quantity sold each month, and uses
the `PARTITION BY`

clause to divide the data into one-month subsets:

```
SELECT MONTH(sales_date) AS MONTH_NUM,
SUM(quantity) OVER (PARTITION BY MONTH(sales_date)) AS QUANTITY
FROM sales
ORDER BY sales_date;
+-----------+----------+
| MONTH_NUM | QUANTITY |
|-----------+----------|
| 1 | 9 |
| 1 | 9 |
| 1 | 9 |
| 2 | 2 |
+-----------+----------+
```

As you can see, the first three rows are duplicates. There were 3 input rows
for month #1, and the window function generates one output row for each input
row, so there are 3 output rows for month #1 in the output. The SUM() function
is not used as a cumulative or sliding function; it is applied to the entire
window and returns the same value for the entire window each time, so we
get duplicate values as shown above. You can reduce the duplicates by using
the `DISTINCT`

keyword:

```
SELECT DISTINCT MONTH(sales_date) AS MONTH_NUM,
SUM(quantity) OVER (PARTITION BY MONTH(sales_date))
FROM sales
ORDER BY MONTH_NUM;
+-----------+-----------------------------------------------------+
| MONTH_NUM | SUM(QUANTITY) OVER (PARTITION BY MONTH(SALES_DATE)) |
|-----------+-----------------------------------------------------|
| 1 | 9 |
| 2 | 2 |
+-----------+-----------------------------------------------------+
```

In this particular case, you could have used a `GROUP BY`

clause rather
than a windowing clause, for example:

```
SELECT MONTH(sales_date) AS MONTH_NUM,
SUM(quantity)
FROM sales
GROUP BY MONTH_NUM
ORDER BY MONTH_NUM;
+-----------+---------------+
| MONTH_NUM | SUM(QUANTITY) |
|-----------+---------------|
| 1 | 9 |
| 2 | 2 |
+-----------+---------------+
```

However, `GROUP BY`

is not as flexible as windowing, and in more complex
queries you can’t always substitute `GROUP BY`

for a window clause.

#### Windows With ORDER BY clause¶

The `ORDER BY`

clause controls the order of the data within each window
(and each partition if there is more than one partition).

This is useful if you want to show a “running sum” over time as new rows are added.

A running sum can be calculated either from the beginning of the window to the current row (inclusive) or from the current row to the end of the window.

A query can use a “sliding” window, which is a fixed-width window that processes N specified rows relative to the current row, for example, the 10 most recent rows (including the current row).

Both cumulative and sliding windows are explained below.

#### Example with Cumulative Window Frame¶

In a “cumulative” window frame, values are computed from the beginning of the window to the current row (or from the current row to the end of the window).

```
SELECT MONTH(sales_date) AS MONTH_NUM,
quantity,
SUM(quantity) OVER (ORDER BY MONTH(sales_date)
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS CUMULATIVE_SUM_QUANTITY
FROM sales
ORDER BY sales_date;
+-----------+----------+-------------------------+
| MONTH_NUM | QUANTITY | CUMULATIVE_SUM_QUANTITY |
|-----------+----------+-------------------------|
| 1 | 1 | 1 |
| 1 | 3 | 4 |
| 1 | 5 | 9 |
| 2 | 2 | 11 |
+-----------+----------+-------------------------+
```

Below we’ve shown the output with an additional “Explanation” column to make it easier to see how the CUMULATIVE_SUM_QUANTITY column was calculated:

```
+-----------+----------+-------------------------+---------------+
| MONTH_NUM | QUANTITY | CUMULATIVE_SUM_QUANTITY | EXPLANATION |
|-----------+----------+-------------------------|---------------+
| 1 | 1 | 1 | 1 |
| 1 | 3 | 4 | 1 + 3 |
| 1 | 5 | 9 | 1 + 3 + 5 |
| 2 | 2 | 11 | 1 + 3 + 5 + 2 |
+-----------+----------+-------------------------+---------------+
```

#### Example With Sliding Window Frame¶

In the financial world, analysts often study “moving averages”.

For example, you might see a graph in which the X axis is time, and the Y axis shows the average price of the stock over the last 13 weeks (the “13-week moving average”). In a graph of a 13-week moving average of a stock price, the price shown for June 30th is not the price of the stock on June 30th, but the average price of the stock for the 13 weeks up to and including June 30th (i.e. April 1st through June 30th). The value on July 1st is the average price for April 2nd through July 1st; the value on July 2nd is the average price for April 3rd through July 2nd, and so on. Each day, we add the most recent day’s value to the moving average, and we remove the oldest day’s value. This smooths out day-to-day fluctuations and can make trends easier to recognize.

Moving averages can be calculated by using a “sliding window”. The window has a specific width in rows. In the stock price example given above, 13 weeks is 91 days, so the sliding window would be 91 days. If the measurements are taken once per day (e.g. at the end of the day), then the window would be 91 rows “wide”.

To define a window 91 rows wide, use a clause similar to the following:

SELECT AVG(price) OVER (ORDER BY timestamp1 ROWS BETWEEN 90 PRECEDING AND CURRENT ROW);

(When the window starts out, it might be less than 91 days wide. For example, if you want the 13-week moving average price of a stock that was created on April 1st, then of course on April 3rd you’ll have only 3 prices so far, so the window will be only 3 rows wide, not 91 rows wide.)

The example below shows the result of summing over a sliding window wide enough to hold two samples:

```
SELECT MONTH(sales_date) AS MONTH_NUM,
quantity,
SUM(quantity) OVER (ORDER BY sales_date
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
AS SLIDING_SUM_QUANTITY
FROM sales
ORDER BY sales_date;
+-----------+----------+----------------------+
| MONTH_NUM | QUANTITY | SLIDING_SUM_QUANTITY |
|-----------+----------+----------------------|
| 1 | 1 | 1 |
| 1 | 3 | 4 |
| 1 | 5 | 8 |
| 2 | 2 | 7 |
+-----------+----------+----------------------+
```

Below we’ve shown the output with an additional “Explanation” column to make it easier to see how the CUMULATIVE_SUM_QUANTITY column was calculated:

```
+-----------+----------+----------------------+---------------------------------------+
| MONTH_NUM | QUANTITY | SLIDING_SUM_QUANTITY | Explanation |
|-----------+----------+----------------------+---------------------------------------|
| 1 | 1 | 1 | 1 |
| 1 | 3 | 4 | 1 + 3 |
| 1 | 5 | 8 | 3 + 5 (1 is no longer in the window) |
| 2 | 2 | 7 | 5 + 2 |
+-----------+----------+----------------------+---------------------------------------+
```

Note that the “sliding window” functionality can’t be used without the
`ORDER BY`

clause; the sliding window must know the order in which rows
enter and exit the sliding window.

#### Example Combining PARTITION BY and ORDER BY Clauses¶

You can combine `PARTITION BY`

and `ORDER BY`

clauses to get
running sums within partitions. In this example, the partitions are one month,
and because the sums apply only within a partition, the sum is reset to 0 at
the beginning of each new month.

```
SELECT MONTH(sales_date) AS MONTH_NUM,
SUM(quantity) OVER (PARTITION BY MONTH(sales_date) ORDER BY sales_date)
AS MONTHLY_CUMULATIVE_SUM_QUANTITY
FROM sales
ORDER BY sales_date;
+-----------+---------------------------------+
| MONTH_NUM | MONTHLY_CUMULATIVE_SUM_QUANTITY |
|-----------+---------------------------------|
| 1 | 1 |
| 1 | 4 |
| 1 | 9 |
| 2 | 2 |
+-----------+---------------------------------+
```

Below we’ve shown the output with an additional “Explanation” column to make it easier to see how the CUMULATIVE_SUM_QUANTITY column was calculated:

```
+-----------+---------------------------------+-------------+
| MONTH_NUM | MONTHLY_CUMULATIVE_SUM_QUANTITY | Explanation |
|-----------+---------------------------------+-------------|
| 1 | 1 | 1 |
| 1 | 4 | 1 + 3 |
| 1 | 9 | 1 + 3 + 5 |
| 2 | 2 | 2 |
+-----------+---------------------------------+-------------+
```

You can combine partitions and sliding windows. In the example below, the sliding window is usually two rows wide, but each time a new partition (i.e a new month) is reached, the sliding window starts with only the first row in that partition:

```
SELECT
MONTH(sales_date) AS MONTH_NUM,
quantity,
SUM(quantity) OVER (PARTITION BY MONTH(sales_date)
ORDER BY sales_date
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
AS MONTHLY_SLIDING_SUM_QUANTITY
FROM sales
ORDER BY sales_date;
+-----------+----------+------------------------------+
| MONTH_NUM | QUANTITY | MONTHLY_SLIDING_SUM_QUANTITY |
|-----------+----------+------------------------------|
| 1 | 1 | 1 |
| 1 | 3 | 4 |
| 1 | 5 | 8 |
| 2 | 2 | 2 |
+-----------+----------+------------------------------+
```

```
+-----------+----------+------------------------------+---------------+
| MONTH_NUM | QUANTITY | MONTHLY_SLIDING_SUM_QUANTITY | Explanation |
|-----------+----------+------------------------------+---------------|
| 1 | 1 | 1 | 1 |
| 1 | 3 | 4 | 1 + 3 |
| 1 | 5 | 8 | 3 + 5 |
| 2 | 2 | 2 | 2 (new month) |
+-----------+----------+------------------------------+---------------+
```