A post by blogger 震动戒指 introducing how to integrate HBase with Hive, shared here for reference.

Overview

Environment: hadoop-2.6.0 (nodes: Master, Slave1, Slave2), hbase-0.98.6-hadoop2, hive-1.2.1

1. Integrating Hive with HBase requires seven jars: guava, hbase-common, hbase-server, hbase-client, hbase-protocol, hbase-it, and htrace-core.

Go into $HIVE_HOME/lib and $HBASE_HOME/lib and check whether the two guava jars have the same version. If they do not, delete Hive's copy from hive/lib:

rm -f guava-XX.jar

Then, from hbase/lib, copy HBase's guava-12.0.1.jar into hive/lib, along with the other six jars:

[root@Master lib]# cp guava-12.0.1.jar /usr/soft/hive-1.2.1/lib/

[root@Master lib]# cp hbase-common-0.98.6-hadoop2.jar /usr/soft/hive-1.2.1/lib/
[root@Master lib]# cp hbase-server-0.98.6-hadoop2.jar /usr/soft/hive-1.2.1/lib/
[root@Master lib]# cp hbase-client-0.98.6-hadoop2.jar /usr/soft/hive-1.2.1/lib/
[root@Master lib]# cp hbase-protocol-0.98.6-hadoop2.jar /usr/soft/hive-1.2.1/lib/
[root@Master lib]# cp hbase-it-0.98.6-hadoop2.jar /usr/soft/hive-1.2.1/lib/
[root@Master lib]# cp htrace-core-2.04.jar /usr/soft/hive-1.2.1/lib/
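The seven copies above can also be done in one loop. This is a sketch only: the paths and version numbers below are the ones from this article's environment, so adjust HBASE_HOME and HIVE_HOME (or the defaults) for yours.

```shell
#!/bin/sh
# Sketch: copy the seven integration jars from HBase's lib into Hive's lib.
# Defaults below assume this article's install paths; override via env vars.
HBASE_LIB="${HBASE_HOME:-/usr/soft/hbase-0.98.6-hadoop2}/lib"
HIVE_LIB="${HIVE_HOME:-/usr/soft/hive-1.2.1}/lib"
for jar in guava-12.0.1.jar \
           hbase-common-0.98.6-hadoop2.jar \
           hbase-server-0.98.6-hadoop2.jar \
           hbase-client-0.98.6-hadoop2.jar \
           hbase-protocol-0.98.6-hadoop2.jar \
           hbase-it-0.98.6-hadoop2.jar \
           htrace-core-2.04.jar; do
  if [ -f "$HBASE_LIB/$jar" ]; then
    cp "$HBASE_LIB/$jar" "$HIVE_LIB/"
    echo "copied $jar"
  else
    echo "not found: $HBASE_LIB/$jar"
  fi
done
```

After Hive is restarted, the new jars on its classpath take effect.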

2. Edit hive-site.xml under hive/conf and append the following property at the end:

<property>
    <name>hbase.zookeeper.quorum</name>
    <value>Master</value>
</property>
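As an alternative to editing hive-site.xml, the Hive HBase integration documentation also allows passing the ZooKeeper quorum when launching the CLI (shown here with this article's single-node quorum, Master):

```shell
# One-off configuration at CLI launch time; handy for testing
# before committing the change to hive-site.xml.
hive --hiveconf hbase.zookeeper.quorum=Master
```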


3. Start Hive. There are two ways to integrate HBase with Hive. The first is to create a managed table, hbase_table_1, whose data is stored in an HBase table:


hive (default)> CREATE TABLE hbase_table_1(key int, value string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
> TBLPROPERTIES ("hbase.table.name" = "xyz");
OK
Time taken: 4.938 seconds
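The SERDEPROPERTIES line deserves a note: Hive pairs its columns with the entries of hbase.columns.mapping strictly by position, so key becomes the HBase row key (:key) and value lands in column family cf1, qualifier val. A small sketch of that positional pairing (illustrative only; the real work happens inside HBaseStorageHandler):

```shell
#!/bin/sh
# Pair each Hive column with the mapping entry in the same position.
mapping=":key,cf1:val"   # from WITH SERDEPROPERTIES
columns="key value"      # Hive columns, in declaration order
i=1
pairs=""
for col in $columns; do
  entry=$(printf '%s' "$mapping" | cut -d, -f"$i")
  pairs="$pairs$col->$entry "
  echo "Hive column '$col' -> HBase '$entry'"
  i=$((i+1))
done
```

A mismatch in count between the column list and the mapping string is a common cause of table-creation errors.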

Check in the HBase shell that the xyz table was created:

hbase(main):004:0> list
TABLE
basic
sub_user
test
xyz
4 row(s) in 0.0310 seconds
=> ["basic", "sub_user", "test", "xyz"]


Insert data into the Hive table hbase_table_1:


hive (default)> insert overwrite table hbase_table_1 select empno, ename from emp;
Query ID = root_20170825100051_a6ec3c4e-f78c-4a63-9db3-291d9e73c0f9
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1502944428192_0002, Tracking URL = http://10.226.118.24:8888/proxy/application_1502944428192_0002/
Kill Command = /usr/soft/hadoop-2.6.0/bin/hadoop job -kill job_1502944428192_0002
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
2017-08-25 10:01:09,798 Stage-0 map = 0%, reduce = 0%
2017-08-25 10:01:12,978 Stage-0 map = 100%, reduce = 0%, Cumulative CPU 1.72 sec
MapReduce Total cumulative CPU time: 1 seconds 720 msec
Ended Job = job_1502944428192_0002
MapReduce Jobs Launched:
Stage-Stage-0: Map: 1   Cumulative CPU: 1.72 sec   HDFS Read: 9729 HDFS Write: 263148 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 720 msec
OK
empno	ename
Time taken: 22.516 seconds


Check whether the data arrived in the HBase table xyz:

hbase(main):006:0> scan 'xyz'
ROW                          COLUMN+CELL
 7369                        column=cf1:val, timestamp=1503626471989, value=SMITH
 7499                        column=cf1:val, timestamp=1503626471989, value=ALLEN
 7521                        column=cf1:val, timestamp=1503626471989, value=WARD
 7566                        column=cf1:val, timestamp=1503626471989, value=JONES
4 row(s) in 0.0800 seconds

4. The second way is to create an external table, hbase_test, over a table that already exists in HBase (here, test). Unlike the managed table above, dropping an external table removes only the Hive metadata and leaves the HBase table in place.

hbase(main):004:0> list
TABLE
basic
sub_user
test
xyz
4 row(s) in 0.0240 seconds
=> ["basic", "sub_user", "test", "xyz"]
hbase(main):003:0> scan 'test'
ROW                          COLUMN+CELL
 10002                       column=cf:age, timestamp=1502847463784, value=56
 10002                       column=cf:name, timestamp=1502847451295, value=zhangsan
 10003                       column=cf:age, timestamp=1503279594383, value=35
 10003                       column=cf:name, timestamp=1502847534361, value=zhaoliu
2 row(s) in 0.2230 seconds

hive (default)> create external table hbase_test(id int, name string, age int)
              > stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
              > with serdeproperties ("hbase.columns.mapping" = ":key,cf:name,cf:age")
              > tblproperties ("hbase.table.name" = "test");
OK
Time taken: 2.619 seconds

hive (default)> select * from hbase_test ;
OK
hbase_test.id	hbase_test.name	hbase_test.age
10002	zhangsan	56
10003	zhaoliu	35
Time taken: 0.595 seconds, Fetched: 2 row(s)







