Hive Basics (7): Hive Syntax (3) DML (2) DQL (1) Basic Queries / WHERE Clause / Grouping

Reposted article, 2020-07-22 20:40, HIVE

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select

Query statement syntax:

[WITH CommonTableExpression (, CommonTableExpression)*] (Note: Only available starting with Hive 0.13.0)
SELECT [ALL | DISTINCT] select_expr, select_expr, ...
FROM table_reference
[WHERE where_condition]
[GROUP BY col_list]
[ORDER BY col_list]
[CLUSTER BY col_list
| [DISTRIBUTE BY col_list] [SORT BY col_list]
]
[LIMIT number]
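
To make the clause order in this synopsis concrete, here is a minimal sketch that uses several of the clauses in one statement. It assumes the emp table created in section 1.1; the CTE name high_paid and the column aliases are illustrative and do not come from the original text.

```sql
-- Assumes the emp table from section 1.1 already exists.
-- high_paid is a hypothetical CTE name chosen for this sketch.
with high_paid as (
  select deptno, ename, sal
  from emp
  where sal > 1000
)
select deptno,
       count(*) as cnt,
       avg(sal) as avg_sal
from high_paid
group by deptno
order by deptno
limit 10;
```

The clauses appear in the same order as in the synopsis: WITH, SELECT, FROM, GROUP BY, ORDER BY, LIMIT.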

1 Basic queries (SELECT … FROM)

1.1 Full-table and specific-column queries

0) Data preparation

(0) Raw data

dept:

10 ACCOUNTING 1700
20 RESEARCH 1800
30 SALES 1900
40 OPERATIONS 1700

emp (fields are tab-separated in emp.txt, in the order empno, ename, job, mgr, hiredate, sal, comm, deptno; a blank cell is an empty field):

7369  SMITH   CLERK      7902  1980-12-17  800.00            20
7499  ALLEN   SALESMAN   7698  1981-2-20   1600.00  300.00   30
7521  WARD    SALESMAN   7698  1981-2-22   1250.00  500.00   30
7566  JONES   MANAGER    7839  1981-4-2    2975.00           20
7654  MARTIN  SALESMAN   7698  1981-9-28   1250.00  1400.00  30
7698  BLAKE   MANAGER    7839  1981-5-1    2850.00           30
7782  CLARK   MANAGER    7839  1981-6-9    2450.00           10
7788  SCOTT   ANALYST    7566  1987-4-19   3000.00           20
7839  KING    PRESIDENT        1981-11-17  5000.00           10
7844  TURNER  SALESMAN   7698  1981-9-8    1500.00  0.00     30
7876  ADAMS   CLERK      7788  1987-5-23   1100.00           20
7900  JAMES   CLERK      7698  1981-12-3   950.00            30
7902  FORD    ANALYST    7566  1981-12-3   3000.00           20
7934  MILLER  CLERK      7782  1982-1-23   1300.00           10

Create the department table

create table if not exists dept(
deptno int,
dname string,
loc int
)
row format delimited fields terminated by '\t';

Create the employee table

create table if not exists emp(
empno int,
ename string,
job string,
mgr int,
hiredate string, 
sal double, 
comm double,
deptno int)
row format delimited fields terminated by '\t';

Load the data

hive (default)> load data local inpath '/opt/module/datas/dept.txt' into table dept;
hive (default)> load data local inpath '/opt/module/datas/emp.txt' into table emp;
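
After loading, a quick sanity check can confirm that the rows arrived and show how empty fields are read back. This is a sketch that assumes dept.txt and emp.txt contain exactly the rows listed above; with Hive's default text format, an empty field in an int or double column is expected to come back as NULL.

```sql
-- Row counts: 4 and 14 are expected only if the files hold exactly the rows listed above.
select count(*) from dept;
select count(*) from emp;

-- KING has no manager and no commission in the sample data,
-- so mgr and comm are expected to be NULL for this row.
select empno, ename, mgr, comm
from emp
where ename = 'KING';
```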

1. Full-table query

hive (default)> select * from emp;

2. Query specific columns

hive (default)> select empno, ename from emp;

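Notes:

(1) SQL keywords are case-insensitive (in Hive, so are table and column names); string values are still compared case-sensitively.

(2) A SQL statement can be written on one line or across several lines.

(3) Keywords cannot be abbreviated or split across lines.

(4) Each clause is generally written on its own line.

(5) Use indentation to improve readability.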
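
The notes above can be illustrated with the same query written in two styles. This is a sketch against the emp table created earlier; the filter on deptno = 10 is only an illustrative choice.

```sql
-- Upper-case keywords, one clause per line, indented select list
SELECT empno,
       ename
FROM   emp
WHERE  deptno = 10;

-- The same query in lower case on a single line
select empno, ename from emp where deptno = 10;

-- String values keep their case: the sample names are upper case,
-- so this comparison is expected to match no rows.
select empno, ename from emp where ename = 'smith';
```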

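1.2 Column aliases

1. Rename a column

2. Make calculations easier

3. The alias follows the column name directly; the keyword AS may also be placed between the column name and the alias

4. Example

Query the employee name and department number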

hive (default)> select ename AS name, deptno dn from emp;
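
An alias is most useful when the selected column is a computed expression, which otherwise gets a generated name. A minimal sketch, where annual_sal is an illustrative alias that does not appear in the original text:

```sql
-- Name the computed column so the result header is readable
select ename,
       sal * 12 as annual_sal
from emp;
```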

1.3 Arithmetic operators

Display every employee's salary increased by 1.

hive (default)> select sal +1 from emp;
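
Arithmetic on a column that contains NULL deserves care, because an expression with a NULL operand evaluates to NULL. The sketch below contrasts the raw sum with one that substitutes 0 through coalesce; the aliases total_raw and total_pay are illustrative.

```sql
select ename,
       sal,
       comm,
       sal + comm              as total_raw,  -- NULL whenever comm is NULL
       sal + coalesce(comm, 0) as total_pay   -- treats a missing commission as 0
from emp;
```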

1.4 Common functions

1. Count the total number of rows (count)

hive (default)> select count(*) cnt from emp;

2. Maximum salary (max)

hive (default)> select max(sal) max_sal from emp;

3. Minimum salary (min)

hive (default)> select min(sal) min_sal from emp;

4. Sum of salaries (sum)

hive (default)> select sum(sal) sum_sal from emp;

5. Average salary (avg)

hive (default)> select avg(sal) avg_sal from emp;
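
The five aggregates can also be computed in a single pass over the table. The sketch below additionally shows count on a specific column, which skips NULL values, unlike count(*); the alias cnt_comm is illustrative.

```sql
select count(*)    as cnt,       -- all rows
       count(comm) as cnt_comm,  -- only rows where comm is not NULL
       max(sal)    as max_sal,
       min(sal)    as min_sal,
       sum(sal)    as sum_sal,
       avg(sal)    as avg_sal
from emp;
```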

1.5 LIMIT clause

A typical query returns many rows. The LIMIT clause restricts the number of rows returned.

hive (default)> select * from emp limit 5;
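
Without an ORDER BY, which five rows come back is not guaranteed. A common pattern, sketched here, is to pair LIMIT with ORDER BY (listed in the syntax at the top of this article) to get a "top N" result:

```sql
-- The five highest-paid employees
select ename, sal
from emp
order by sal desc
limit 5;
```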

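2 WHERE clause

1. Use the WHERE clause to filter out rows that do not satisfy a condition

2. The WHERE clause immediately follows the FROM clause

3. Example

Find all employees whose salary is greater than 1000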

hive (default)> select * from emp where sal >1000;

Note: column aliases cannot be used in the WHERE clause.
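
The alias restriction follows from evaluation order: WHERE is applied before the select list is computed. The sketch below shows a query that Hive is expected to reject (left commented out) and two ways around it; the alias s and the subquery alias t are illustrative.

```sql
-- Rejected: the alias s is not visible inside WHERE
-- select ename, sal as s from emp where s > 1000;

-- Works: filter on the original column
select ename, sal as s
from emp
where sal > 1000;

-- Also works: define the alias in a subquery, then filter outside it
select t.ename, t.s
from (select ename, sal as s from emp) t
where t.s > 1000;
```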

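2.1 Comparison operators (BETWEEN / IN / IS NULL)

1) The predicate operators shown in the examples below (=, BETWEEN, IS NULL, IN) can also be used in JOIN … ON and HAVING clauses.

2) Examples
(1) Find all employees whose salary equals 5000
hive (default)> select * from emp where sal =5000;
(2) Find employees whose salary is between 500 and 1000 (both ends inclusive)
hive (default)> select * from emp where sal between 500 and 1000;
(3) Find all employees whose comm is NULL
hive (default)> select * from emp where comm is null;
(4) Find employees whose salary is 1500 or 5000
hive (default)> select * from emp where sal IN (1500, 5000);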
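
The negated forms of these operators, and one common NULL pitfall, can be sketched as follows against the same emp table:

```sql
-- Salaries outside the inclusive range 500..1000
select * from emp where sal not between 500 and 1000;

-- Employees that do have a commission value
select * from emp where comm is not null;

-- Pitfall: comparing with NULL through = yields NULL rather than true,
-- so this returns no rows; use IS NULL instead.
select * from emp where comm = null;

-- Not equal: <> and != are interchangeable
select * from emp where deptno <> 30;

-- Salaries that are neither 1500 nor 5000
select * from emp where sal not in (1500, 5000);
```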

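2.2 LIKE and RLIKE

1) Use the LIKE operator to select similar values

2) The pattern can contain characters or digits:

% stands for zero or more characters (any number of characters).

_ stands for exactly one character.

3) The RLIKE clause is a Hive extension of this feature; it lets you specify the match condition with Java regular expressions, a more powerful language.

4) Examples

(1) Find employees whose salary starts with 2
hive (default)> select * from emp where sal LIKE '2%';
(2) Find employees whose salary has 2 as its second digit
hive (default)> select * from emp where sal LIKE '_2%';
(3) Find employees whose salary contains a 2
hive (default)> select * from emp where sal RLIKE '[2]';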
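
The examples above apply string patterns to the numeric column sal, which relies on an implicit conversion to string. The sketch below shows the same operators on the string column ename, plus an explicit cast for the numeric case; the patterns are illustrative.

```sql
-- Names starting with A
select * from emp where ename like 'A%';

-- Names whose second character is A
select * from emp where ename like '_A%';

-- RLIKE succeeds if any part of the string matches, so this finds names containing A
select * from emp where ename rlike 'A';

-- Anchors tie the regex to the whole string: starts with A and ends with S
-- (ADAMS in the sample data)
select * from emp where ename rlike '^A.*S$';

-- Making the numeric-to-string conversion explicit
select * from emp where cast(sal as string) like '2%';
```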

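2.3 Logical operators (AND / OR / NOT)

Examples

(1) Find employees whose salary is greater than 1000 and whose department is 30
hive (default)> select * from emp where sal>1000 and deptno=30;
(2) Find employees whose salary is greater than 1000 or whose department is 30
hive (default)> select * from emp where sal>1000 or deptno=30;
(3) Find employees in departments other than 20 and 30
hive (default)> select * from emp where deptno not IN(30, 20);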
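
When AND and OR are mixed, AND binds more tightly than OR, so parentheses matter. A sketch of the difference, using job = 'MANAGER' as an illustrative extra condition:

```sql
-- These two statements are equivalent: AND is evaluated before OR
select * from emp where sal > 1000 and deptno = 30 or job = 'MANAGER';
select * from emp where (sal > 1000 and deptno = 30) or job = 'MANAGER';

-- Parentheses change the meaning: salary must exceed 1000 in every returned row
select * from emp where sal > 1000 and (deptno = 30 or job = 'MANAGER');

-- NOT applied to a compound condition; same rows as deptno not in (20, 30)
select * from emp where not (deptno = 20 or deptno = 30);
```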

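3 Grouping

3.1 GROUP BY clause

The GROUP BY clause is usually used together with aggregate functions: it groups the result by one or more columns and then applies the aggregate operation to each group.

Examples:

(1) Compute the average salary of each department in the emp table
hive (default)> select t.deptno, avg(t.sal) avg_sal from emp t group by t.deptno;
(2) Compute the highest salary for each job within each department of emp
hive (default)> select t.deptno, t.job, max(t.sal) max_sal from emp t group by t.deptno, t.job;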
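
One rule worth keeping in mind with GROUP BY is that every column in the select list must either appear in the GROUP BY list or be wrapped in an aggregate function. The sketch below shows several aggregates per group and a query that Hive is expected to reject (left commented out); the aliases are illustrative.

```sql
-- Several aggregates per department
select deptno,
       count(*) as cnt,
       sum(sal) as sum_sal,
       avg(sal) as avg_sal
from emp
group by deptno;

-- Rejected: ename is neither in GROUP BY nor inside an aggregate
-- select deptno, ename, avg(sal) from emp group by deptno;
```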

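3.2 HAVING clause

1. Differences between HAVING and WHERE

(1) Aggregate (group) functions cannot be used in a WHERE clause, but they can be used in a HAVING clause.

(2) HAVING is used only in GROUP BY aggregate statements.

2. Example

(1) Find the departments whose average salary is greater than 2000

First, compute the average salary of each department
hive (default)> select deptno, avg(sal) from emp group by deptno;
Then, keep only the departments whose average salary is greater than 2000
hive (default)> select deptno, avg(sal) avg_sal from emp group by deptno having avg_sal > 2000;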
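
WHERE and HAVING can appear in the same statement: WHERE filters individual rows before grouping, and HAVING filters the groups after aggregation. A sketch, where excluding the PRESIDENT row is only an illustrative row-level filter and t is an illustrative subquery alias:

```sql
-- Row filter first, then group, then group filter
select deptno, avg(sal) as avg_sal
from emp
where job <> 'PRESIDENT'
group by deptno
having avg(sal) > 2000;

-- The HAVING example from the text, rewritten with a subquery and WHERE
select t.deptno, t.avg_sal
from (select deptno, avg(sal) as avg_sal from emp group by deptno) t
where t.avg_sal > 2000;
```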
