概述
数据查询:
SELECT [ALL | DISTINCT] select_expr, select_expr, ...
FROM table_reference
[WHERE where_condition]
[GROUP BY col_list]
[ORDER BY col_list]
[CLUSTER BY col_list
| [DISTRIBUTE BY col_list] [SORT BY col_list]
]
[LIMIT [offset,] rows]
全表查询:
select * from emp;
指定字段查询:
select t.empno,t.ename,t.job from emp t;
限制查询
select * from emp limit 5;
= >= <= between and
select t.empno,t.ename,t.job from emp t where t.sal between 800 and 1500;
is null/is not null /in /not in
select t.empno,t.ename,t.job from emp t where t.comm is null;
max/min/count/sum/avg
select count(*) cnt from emp;
select max(sal) max_sal from emp;
select sum(sal) sum_sal from emp;
select avg(sal) avg_sal from emp;
group by /having
每个部门的平均工资
select t.deptno,avg(sal) avg_sal from emp t group by t.deptno;
每个部门中最高岗位的薪水
select t.deptno,t.job,max(t.sal) max_sal from emp t group by t.deptno,t.job;
having
where是针对单条记录进行筛选过滤的
having是对分组结果进行筛选过滤的
求每个部门的平均薪水大于2000
select t.deptno,avg(t.sal) avg_sal from emp t group by t.deptno having avg_sal > 2000;
join
两个表进行连接,m表中的记录和n表的中记录组成一条记录
等值join:join ... on ...
select e.empno,e.ename,e.deptno from emp e join dept d on e.deptno = d.deptno ;
右连接:right join(以右表为准)
select e.empno,e.ename,e.deptno from emp e right join dept d on e.deptno = d.deptno ;
左连接:left join(以左表为准)
select e.empno,e.ename,e.deptno from emp e left join dept d on e.deptno = d.deptno ;
全连接
select e.empno,e.ename,e.deptno from emp e full join dept d on e.deptno = d.deptno ;
order by:对全局数据进行排序,仅仅只有一个reduce
select * from emp order by empno desc;
sort by:对每一个reduce内部数据进行排序,对全局的结果集来说没有排序
设置mapreduce个数:
set mapreduce.job.reduces= 3;
查看设置:
set mapreduce.job.reduces
insert overwrite local directory '/home/wql/app/hData/sortby-res' select * from emp sort by empno asc;
distribute by:类似于MapReduce中分区partition,对数据进行分区,结合sort by进行使用
insert overwrite local directory '/home/wql/app/hData/distby-res' select * from emp distribute by deptno sort by empno asc;
cluster by当distribute by和sort by字段相同时,就可以使用cluster by;
insert overwrite local directory '/home/wql/app/hData/clusby-res' select * from emp cluster by empno;
查看所有函数:
show functions;
desc function split;
desc function extended split;
最后
以上就是快乐毛巾为你收集整理的Hive查询实例的全部内容,希望文章能够帮你解决Hive查询实例所遇到的程序开发问题。
如果觉得靠谱客网站的内容还不错,欢迎将靠谱客网站推荐给程序员好友。
本图文内容来源于网友提供,作为学习参考使用,或来自网络收集整理,版权属于原作者所有。
发表评论 取消回复