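MySQL window functions and how to use them
Recommended reading first: the article "MYSQL窗口函数 - 知乎" (zhihu.com), which covers the topic in great detail.
Definition: window functions are also called OLAP (Online Analytical Processing) functions; they analyze and process data in real time.
Tip: the examples come from LeetCode or 牛客网.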
Categories:
- Dedicated window functions: rank(), dense_rank(), row_number()
- Aggregate functions: max(), min(), count(), sum(), avg()
Syntax:
select <window function> over (partition by <column to group by> order by <column to sort by>)
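To make the syntax concrete, here is a minimal sketch. It assumes MySQL 8.0 or later (the first release with window functions) and a hypothetical table named ClassScores, which is made up for illustration and is not one of the tables from the exercises.

```sql
-- Hypothetical scratch table for illustration only.
-- Run in a throwaway database: the DROP removes any table with this name.
DROP TABLE IF EXISTS ClassScores;
CREATE TABLE ClassScores (
    student VARCHAR(20),
    class   INT,
    score   INT
);
INSERT INTO ClassScores (student, class, score) VALUES
    ('A', 1, 86),
    ('B', 1, 95),
    ('C', 2, 89),
    ('D', 2, 89),
    ('E', 2, 70);

SELECT student,
       class,
       score,
       -- partition by + order by: the ranking restarts for each class
       -- (note that there is no comma between the two parts)
       rank() OVER (PARTITION BY class ORDER BY score DESC) AS rank_in_class,
       -- order by only: one ranking across the whole table
       rank() OVER (ORDER BY score DESC) AS rank_overall,
       -- partition by only: one value per class, repeated on each of its rows
       count(*) OVER (PARTITION BY class) AS class_size
FROM ClassScores;
```

With this data, rank_in_class is 1 for B and 2 for A in class 1, while C and D tie at 1 in class 2 and E gets 3. rank_overall is 1 for B, 2 for C and D, 4 for A, and 5 for E. class_size is 2 for class 1 and 3 for class 2.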
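1. The rank() function
Notes
- rank() is a ranking function; its parentheses take no arguments.
- partition by splits the rows into groups, for example by class. This resembles the group by clause, but group by collapses each group into a single summary row and so changes the number of rows, whereas a window function keeps the original number of rows.
- order by sorts the rows within each group, for example by score in descending order. It works like the ordinary order by clause and can be followed by asc (ascending) or desc (descending).
Note: window functions operate on the result produced after the where and group by clauses have been applied. Given the order in which a SQL statement is evaluated, window functions are therefore generally placed in the select clause.
Data table (Scores):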
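The notes above can be tried out with a small sketch. It assumes MySQL 8.0 or later and a hypothetical ClassScores table (class and score columns, made up for illustration) to match the class/score wording of the notes.

```sql
-- Hypothetical scratch table for illustration only.
-- Run in a throwaway database: the DROP removes any table with this name.
DROP TABLE IF EXISTS ClassScores;
CREATE TABLE ClassScores (student VARCHAR(20), class INT, score INT);
INSERT INTO ClassScores (student, class, score) VALUES
    ('A', 1, 86), ('B', 1, 95), ('C', 2, 89), ('D', 2, 89), ('E', 2, 70);

-- group by collapses each class into one row: 2 rows come back.
SELECT class, count(*) AS students
FROM ClassScores
GROUP BY class;

-- partition by keeps every input row: 5 rows come back,
-- each carrying the size of its own class.
SELECT student, class, count(*) OVER (PARTITION BY class) AS students
FROM ClassScores;

-- where runs before the window function, so the rank is computed
-- only over the rows that survive the filter (scores of 80 or more).
SELECT student, class, score,
       rank() OVER (PARTITION BY class ORDER BY score DESC) AS ranking
FROM ClassScores
WHERE score >= 80;

-- A window function cannot appear in where. To filter on its result,
-- compute it in a subquery and filter in the outer query.
SELECT student, class, score
FROM (
    SELECT student, class, score,
           rank() OVER (PARTITION BY class ORDER BY score DESC) AS ranking
    FROM ClassScores
) ranked
WHERE ranking = 1;
```

The last query returns B for class 1 and both C and D for class 2, since the two tie for first place.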
+----+-------+
| Id | Score |
+----+-------+
| 1 | 3.50 |
| 2 | 3.65 |
| 3 | 4.00 |
| 4 | 3.85 |
| 5 | 4.00 |
| 6 | 3.65 |
+----+-------+
LeetCode 178. Rank Scores
With rank(), tied rows share a rank and still take up the positions that follow, so the next distinct value skips ahead.
SELECT Score, rank() over(order by Score desc) as 'Rank' from Scores;
//{"headers": ["Score", "Rank"], "values": [[4.00, 1], [4.00, 1], [3.85, 3], [3.65, 4], [3.65, 4], [3.50, 6]]}
With dense_rank(), tied rows share a rank but do not take up the positions that follow, so the ranks stay consecutive.
SELECT Score, dense_rank() over(order by Score desc) as 'Rank' from Scores;
//{"headers": ["Score", "Rank"], "values": [[4.00, 1], [4.00, 1], [3.85, 2], [3.65, 3], [3.65, 3], [3.50, 4]]}
With row_number(), ties are ignored: every row gets its own consecutive number.
SELECT Score, row_number() over(order by Score desc) as 'Rank' from Scores;
//{"headers": ["Score", "Rank"], "values": [[4.00, 1], [4.00, 2], [3.85, 3], [3.65, 4], [3.65, 5], [3.50, 6]]}
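To compare the three functions on the same rows, the sketch below puts them side by side in one query. It assumes MySQL 8.0 or later; the column type DECIMAL(3, 2) and the alias names are assumptions chosen for illustration, since the exercise only shows the data.

```sql
-- Recreate the Scores table from the exercise.
-- Run in a throwaway database: the DROP removes any table with this name.
DROP TABLE IF EXISTS Scores;
CREATE TABLE Scores (
    Id    INT PRIMARY KEY,
    Score DECIMAL(3, 2)   -- assumed type
);
INSERT INTO Scores (Id, Score) VALUES
    (1, 3.50), (2, 3.65), (3, 4.00), (4, 3.85), (5, 4.00), (6, 3.65);

SELECT Score,
       rank()       OVER (ORDER BY Score DESC) AS rank_value,
       dense_rank() OVER (ORDER BY Score DESC) AS dense_rank_value,
       row_number() OVER (ORDER BY Score DESC) AS row_number_value
FROM Scores
-- The second sort key makes the display order of tied rows deterministic.
ORDER BY Score DESC, row_number_value;

-- Score | rank_value | dense_rank_value | row_number_value
-- 4.00  | 1          | 1                | 1
-- 4.00  | 1          | 1                | 2
-- 3.85  | 3          | 2                | 3
-- 3.65  | 4          | 3                | 4
-- 3.65  | 4          | 3                | 5
-- 3.50  | 6          | 4                | 6
```

Which of two tied rows receives the lower row_number() is not defined unless the window's order by includes a tie-breaker such as Id; here the tied rows are identical in the output, so the result looks the same either way.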
2. Aggregate functions (LeetCode 184. Department Highest Salary)
The Employee table holds all employees; each employee has an Id, a Salary, and a DepartmentId.
+----+-------+--------+--------------+
| Id | Name | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1 | Joe | 70000 | 1 |
| 2 | Jim | 90000 | 1 |
| 3 | Henry | 80000 | 2 |
| 4 | Sam | 60000 | 2 |
| 5 | Max | 90000 | 1 |
+----+-------+--------+--------------+
The Department table holds all departments of the company.
+----+----------+
| Id | Name |
+----+----------+
| 1 | IT |
| 2 | Sales |
+----+----------+
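Purpose: used as a window function with an order by inside over, an aggregate function has a running (cumulative) effect: it reports, for example, the maximum or minimum up to and including the current row. Without order by, as in the query below, the aggregate is computed over the whole partition.
Difference from the dedicated window functions: the parentheses must name a column and cannot be empty.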
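The running effect is easiest to see next to the whole-partition version. The sketch below assumes MySQL 8.0 or later and recreates the Employee table from the exercise; the column types and the alias names are assumptions for illustration.

```sql
-- Recreate the Employee table from the exercise (column types are assumed).
-- Run in a throwaway database: the DROP removes any table with this name.
DROP TABLE IF EXISTS Employee;
CREATE TABLE Employee (
    Id           INT PRIMARY KEY,
    Name         VARCHAR(20),
    Salary       INT,
    DepartmentId INT
);
INSERT INTO Employee (Id, Name, Salary, DepartmentId) VALUES
    (1, 'Joe',   70000, 1),
    (2, 'Jim',   90000, 1),
    (3, 'Henry', 80000, 2),
    (4, 'Sam',   60000, 2),
    (5, 'Max',   90000, 1);

SELECT Id, Name, Salary, DepartmentId,
       -- with order by: highest salary seen so far, in Id order, per department
       max(Salary) OVER (PARTITION BY DepartmentId ORDER BY Id) AS running_max,
       -- with order by: salaries added up so far, in Id order, per department
       sum(Salary) OVER (PARTITION BY DepartmentId ORDER BY Id) AS running_total,
       -- without order by: highest salary of the whole department on every row
       max(Salary) OVER (PARTITION BY DepartmentId) AS department_max
FROM Employee
ORDER BY DepartmentId, Id;

-- Id | Name  | Salary | DepartmentId | running_max | running_total | department_max
-- 1  | Joe   | 70000  | 1            | 70000       | 70000         | 90000
-- 2  | Jim   | 90000  | 1            | 90000       | 160000        | 90000
-- 5  | Max   | 90000  | 1            | 90000       | 250000        | 90000
-- 3  | Henry | 80000  | 2            | 80000       | 80000         | 80000
-- 4  | Sam   | 60000  | 2            | 80000       | 140000        | 80000
```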
select Department, Employee, Salary
from (
    select d.Name as Department,
           e.Name as Employee,
           e.Salary,
           max(e.Salary) over (partition by d.Name) as max_salary
    from Employee e
    join Department d on e.DepartmentId = d.Id
) s
where Salary = max_salary;
//{"headers": ["Department", "Employee", "Salary"], "values": [["IT", "Jim", 90000], ["IT", "Max", 90000], ["Sales", "Henry", 80000]]}
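As an alternative sketch, not the approach used above, the same result can be obtained with the dedicated function rank() from section 1 instead of max(). It assumes MySQL 8.0 or later; the column types and the names salary_rank and ranked are assumptions for illustration.

```sql
-- Recreate both tables from the exercise (column types are assumed).
-- Run in a throwaway database: the DROPs remove any tables with these names.
DROP TABLE IF EXISTS Employee;
DROP TABLE IF EXISTS Department;
CREATE TABLE Department (
    Id   INT PRIMARY KEY,
    Name VARCHAR(20)
);
CREATE TABLE Employee (
    Id           INT PRIMARY KEY,
    Name         VARCHAR(20),
    Salary       INT,
    DepartmentId INT
);
INSERT INTO Department (Id, Name) VALUES (1, 'IT'), (2, 'Sales');
INSERT INTO Employee (Id, Name, Salary, DepartmentId) VALUES
    (1, 'Joe',   70000, 1),
    (2, 'Jim',   90000, 1),
    (3, 'Henry', 80000, 2),
    (4, 'Sam',   60000, 2),
    (5, 'Max',   90000, 1);

-- Rank salaries inside each department, then keep the rows ranked first.
-- rank() gives tied top earners the same rank, so both are returned.
SELECT Department, Employee, Salary
FROM (
    SELECT d.Name AS Department,
           e.Name AS Employee,
           e.Salary,
           rank() OVER (PARTITION BY e.DepartmentId
                        ORDER BY e.Salary DESC) AS salary_rank
    FROM Employee e
    JOIN Department d ON e.DepartmentId = d.Id
) ranked
WHERE salary_rank = 1;
```

This returns Jim and Max for IT and Henry for Sales, the same three rows as the max() version. Changing the filter to salary_rank <= 2 would also pull in the runners-up, which is harder to express with max().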