- 窗口函数对一组查询行执行类似聚合的操作。但是,聚合操作将查询行分组到单个结果行中,而窗口函数则为每个查询行生成一个结果:
- 对其进行函数求值的行称为当前行。
- 与进行函数计算的当前行相关的查询行构成当前行的窗口。
函数 |
说明 |
CUME_DIST() |
累计分布值 |
DENSE_RANK() |
其分区中当前行的排名,无间隙 |
FIRST_VALUE() |
窗口框架第一行中的参数值 |
LAG() |
分区中滞后当前行的行的参数值 |
LAST_VALUE() |
窗口框架最后一行的参数值 |
LEAD() |
分区中行前导当前行的参数值 |
NTH_VALUE() |
窗口框架第 N 行中的参数值 |
NTILE() |
其分区中当前行的存储桶编号。 |
PERCENT_RANK() |
百分比排名值 |
RANK() |
当前行在其分区内的排名,带有间隙 |
ROW_NUMBER() |
其分区中的当前行数 |
mysql> SELECT * FROM sales ORDER BY country, year, product;
+--------+-----------+--------------+----------+
| year | country | product | profit |
+--------+-----------+--------------+----------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2001 | Finland | Phone | 10 |
| 2000 | India | Calculator | 75 |
| 2000 | India | Calculator | 75 |
| 2000 | India | Computer | 1200 |
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 1500 |
| 2001 | USA | Computer | 1200 |
| 2001 | USA | TV | 150 |
| 2001 | USA | TV | 100 |
+--------+------------+--------------+-----------+
- 例如,使用销售信息表,这两个查询执行聚合操作,这些操作为作为一个组采用的所有行生成单个全局总和,并按国家/地区分组总和:
mysql>SELECT SUM(profit) AS total_profit
FROM sales;
+--------------+
|total_profit |
+--------------+
|7535 |
+--------------+
mysql>SELECT country, SUM(profit) AS country_profit
FROM sales
GROUPBY country
ORDERBY country;
+---------+------------------+
|country | country_profit |
+---------+------------------+
| Finland | 1610 |
| India | 1350 |
| USA | 4575 |
+---------+------------------+
- 相比之下,窗口操作不会将查询行组折叠为单个输出行。相反,它们为每行生成一个结果。与前面的查询一样,以下查询使用 SUM(),但这次是作为窗口函数:
mysql>SELECT year, country, product, profit,
SUM(profit) OVER() AS total_profit,
SUM(profit)OVER (PARTITIONBY country) AS country_profit
FROM sales
ORDERBY country, year, product, profit;+-------+----------+------------+---------+---------------+------------------+
| year | country | product | profit | total_profit | country_profit | +-------+----------+------------+---------+---------------+------------------+
| 2000 | Finland | Computer | 1500 | 7535 | 1610 |
| 2000 | Finland | Phone | 100 | 7535 | 1610 |
| 2001 | Finland | Phone | 10 | 7535 | 1610 |
| 2000 | India | Calculator | 75 | 7535 | 1350 |
| 2000 | India | Calculator | 75 | 7535 | 1350 |
| 2000 | India | Computer | 1200 | 7535 | 1350 |
| 2000 | USA | Calculator | 75 | 7535 | 4575 |
| 2000 | USA | Computer | 1500 | 7535 | 4575 |
| 2001 | USA | Calculator | 50 | 7535 | 4575 |
| 2001 | USA | Computer | 1200 | 7535 | 4575 |
| 2001 | USA | Computer | 1500 | 7535 | 4575 |
| 2001 | USA | TV | 100 | 7535 | 4575 |
| 2001 | USA | TV | 150 | 7535 | 4575 | +-------+-----------+------------+---------+---------------+------------------+
- 窗口函数对一组查询行执行类似聚合的操作。但是,虽然聚合操作将查询行分组为单个结果行,但窗口函数会为每个查询行生成一个结果:
- 发生函数评估的行称为当前行。
- 与发生函数评估的当前行相关的查询行构成当前行的窗口。
- 例如,使用销售信息表,这两个查询执行聚合操作,为作为一个组的所有行生成一个全局总和,并按国家/地区分组:
mysql> SELECT SUM(profit) AS total_profit FROM sales;
+---------------+
| total_profit |
+---------------+
| 7535 |
+---------------+
mysql> SELECT country, SUM(profit) AS country_profit
FROM sales
GROUP BY country
ORDER BY country;
+----------+------------------+
| country | country_profit |
+----------+------------------+
| Finland | 1610 |
| India | 1350 |
| USA | 4575 |
+----------+------------------+
- 相比之下,窗口操作不会将查询行组折叠为单个输出行。相反,它们为每一行生成一个结果。与前面的查询一样,以下查询使用 SUM(),但这次是作为窗口函数:
mysql> SELECT year, country, product, profit,
SUM(profit) OVER() AS total_profit,
SUM(profit) OVER(PARTITION BY country) AS country_profit
FROM sales
ORDER BY country, year, product, profit;
+-------+-------------+-------------+--------+-----------------+------------------+
| year | country | product | profit | total_profit | country_profit |
+-------+-------------+-------------+--------+-----------------+------------------+
| 2000 | Finland | Computer | 1500 | 7535 | 1610 |
| 2000 | Finland | Phone | 100 | 7535 | 1610 |
| 2001 | Finland | Phone | 10 | 7535 | 1610 |
| 2000 | India | Calculator | 75 | 7535 | 1350 |
| 2000 | India | Calculator | 75 | 7535 | 1350 |
| 2000 | India | Computer | 1200 | 7535 | 1350 |
| 2000 | USA | Calculator | 75 | 7535 | 4575 |
| 2000 | USA | Computer | 1500 | 7535 | 4575 |
| 2001 | USA | Calculator | 50 | 7535 | 4575 |
| 2001 | USA | Computer | 1200 | 7535 | 4575 |
| 2001 | USA | Computer | 1500 | 7535 | 4575 |
| 2001 | USA | TV | 100 | 7535 | 4575 |
| 2001 | USA | TV | 150 | 7535 | 4575 | +--------+------------+--------------+--------+----------------+------------------+
- 查询中的每个窗口操作都通过包含一个 OVER 子句来表示,该子句指定如何将查询行划分为组以供窗口函数处理:
- 第一个 OVER 子句是空的,它将整个查询行集视为一个分区。窗口函数因此产生一个全局和,但对每一行都这样做。
- 第二个 OVER 子句按国家划分行,产生每个分区(每个国家)的总和。该函数为每个分区行生成此总和。
- 仅在选择列表和 ORDER BY 子句中允许使用窗口函数。查询结果行由 FROM 子句确定,在 WHERE、GROUP BY 和 HAVING 处理之后,窗口执行发生在 ORDER BY、LIMIT 和 SELECT DISTINCT 之前。
- 许多聚合函数都允许使用 OVER 子句,因此可以将其用作窗口或非窗口函数,具体取决于是否存在 OVER 子句
- 作为这些非聚合窗口函数之一的示例,此查询使用 ROW_NUMBER(),它生成其分区内每一行的行号。在这种情况下,行按国家/地区编号。默认情况下,分区行是无序的,行编号是不确定的。
- 要对分区行进行排序,请在窗口定义中包含 ORDER BY 子句。该查询使用无序和有序分区(row_num1 和 row_num2 列)来说明省略和包含 ORDER BY 之间的区别:
mysql> SELECT year, country, product, profit,
ROW_NUMBER() OVER(PARTITION BY country) AS row_num1,
ROW_NUMBER() OVER(PARTITION BY country ORDER BY year, product) AS row_num2 FROM sales;
+-------+----------+--------------+----------+-------------+-------------+
| year | country | product | profit | row_num1 | row_num2 |
+-------+----------+--------------+----------+-------------+-------------+
| 2000 | Finland | Computer | 1500 | 2 | 1 |
| 2000 | Finland | Phone | 100 | 1 | 2 |
| 2001 | Finland | Phone | 10 | 3 | 3 |
| 2000 | India | Calculator | 75 | 2 | 1 |
| 2000 | India | Calculator | 75 | 3 | 2 |
| 2000 | India | Computer | 1200 | 1 | 3 |
| 2000 | USA | Calculator | 75 | 5 | 1 |
| 2000 | USA | Computer | 1500 | 4 | 2 |
| 2001 | USA | Calculator | 50 | 2 | 3 |
| 2001 | USA | Computer | 1500 | 3 | 4 |
| 2001 | USA | Computer | 1200 | 7 | 5 |
| 2001 | USA | TV | 150 | 1 | 6 |
| 2001 | USA | TV | 100 | 6 | 7 |
+-------+----------+---------------+---------+-------------+-------------+
来自 <https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html>