窗口函数支持distinct过滤
功能描述
Vastbase在Oracle兼容模式下,支持在使用窗口聚集函数的条件下使用distinct去重过滤,其中包括函数有:avg、count、max、min、stddev、sum、variance。
窗口函数:在包含over子句的sql中调用的函数,窗口函数必须与over语句一起使用。
注意事项
该功能仅在数据库兼容模式为Oracle时能够使用(即创建DB时DBCOMPATIBILITY='A'),在其他数据库兼容模式下不能使用该特性。
对于支持dictinct的7个函数,其中max与min可以在over子句中添加order by,其他函数则不可以添加。
列存表仅支持在variance函数中使用distinct功能。
示例
前置步骤: 创建并切换至兼容模式为Oracle的数据库db_oracle。
CREATE DATABASE db_oracle dbcompatibility='A';
\c db_oracle
示例1: 在SQL中使用多个窗口函数。
1、创建测试表并插入数据。
CREATE TABLE employee (id INT PRIMARY KEY,name VARCHAR(50),manager_id INT,salary INT);
INSERT INTO employee (id, name, manager_id, salary) VALUES (1, 'Alice', 3, 5000);
INSERT INTO employee (id, name, manager_id, salary) VALUES (2, 'Bob', 3, 6000);
INSERT INTO employee (id, name, manager_id, salary) VALUES (3, 'Charlie', 4, 8000);
INSERT INTO employee (id, name, manager_id, salary) VALUES (4, 'David', null, 10000);
INSERT INTO employee (id, name, manager_id, salary) VALUES (5, 'Eve', 3, 7000);
INSERT INTO employee (id, name, manager_id, salary) VALUES (6, 'Frank', 5, 9000);
INSERT INTO employee (id, name, manager_id, salary) VALUES (7, 'Lily', 1, 5000);
2、查询employee表数据。
select * from employee;
返回结果为:
id | name | manager_id | salary
----+---------+------------+--------
1 | Alice | 3 | 5000
2 | Bob | 3 | 6000
3 | Charlie | 4 | 8000
4 | David | | 10000
5 | Eve | 3 | 7000
6 | Frank | 5 | 9000
7 | Lily | 1 | 5000
(7 rows)
3、执行如下命令使用窗口函数去重。
select id,name, salary , sum(distinct salary) over( ) , avg(distinct salary) over() from employee;
select id,name, salary , min(distinct salary) over( ),stddev(distinct salary) over(partition by salary ) , variance(distinct salary) over(partition by salary) from employee;
返回结果依次为:
id | name | salary | sum | avg
----+---------+--------+-------+------
1 | Alice | 5000 | 45000 | 7500
2 | Bob | 6000 | 45000 | 7500
3 | Charlie | 8000 | 45000 | 7500
4 | David | 10000 | 45000 | 7500
5 | Eve | 7000 | 45000 | 7500
6 | Frank | 9000 | 45000 | 7500
7 | Lily | 5000 | 45000 | 7500
(7 rows)
id | name | salary | min | stddev | variance
----+---------+--------+------+--------+----------
7 | Lily | 5000 | 5000 | |
1 | Alice | 5000 | 5000 | |
2 | Bob | 6000 | 5000 | |
5 | Eve | 7000 | 5000 | |
3 | Charlie | 8000 | 5000 | |
6 | Frank | 9000 | 5000 | |
4 | David | 10000 | 5000 | |
(7 rows)
示例2:在存储过程中使用窗口函数。
1、创建测试表并插入数据。
create table test1(id int,date timestamp);
insert into test1 values(1,'1111-01-01 11:11:00.1111'),(2,'2222-11-02 00:00:00.123');
insert into test1 values(2,null);
create table test2(id int,dis numeric);
2、查询test1表数据。
select * from test1;
返回结果为:
id | date
----+--------------------------
1 | 1111-01-01 11:11:00.1111
2 | 2222-11-02 00:00:00.123
2 |
(3 rows)
3、创建存储过程并在存储过程中使用窗口函数。
CREATE OR REPLACE PROCEDURE pro1()
AS
BEGIN
insert into test2 select id,count(distinct id) over() from test1;
insert into test2 select id,sum(distinct id) over(partition by date) from test1;
insert into test2 select id,avg(distinct id) over(partition by date) from test1;
insert into test2 select id,max(distinct id) over(order by id) from test1;
insert into test2 select id,min(distinct id) over(partition by date) from test1;
insert into test2 select id,stddev(distinct id) over(partition by date) from test1;
insert into test2 select id,variance(distinct id) over() from test1;
END;
/
4、调用存储过程。
select pro1();
返回如下结果表示调用成功:
pro1
------
(1 row)
5、查询表test2数据。
select * from test2;
返回结果如下:
id | dis
----+-----
1 | 2
2 | 2
2 | 2
1 | 1
2 | 2
2 | 2
1 | 1
2 | 2
2 | 2
1 | 1
2 | 2
2 | 2
1 | 1
2 | 2
2 | 2
1 |
2 |
2 |
1 | .5
2 | .5
2 | .5
(21 rows)