This article describes how to use Flume to import data into HBase. It is shared here as a reference.

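Overview

Contents

      • I. Install flume-ng-1.6.0-cdh5.5.2
        • 1. Preparation:
        • 2. Edit the flume-env.sh configuration file:
        • 3. Verify the installation:
      • II. Import data into HBase
        • 1. Preparation:
        • 2. Create the configuration file:
        • 3. Start the Flume agent:
        • 4. Generate data: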

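I. Install flume-ng-1.6.0-cdh5.5.2

1. Preparation:

Download the tarball from http://archive.cloudera.com/cdh5/cdh/5/flume-ng-1.6.0-cdh5.5.2.tar.gz and unpack it. The commands in the rest of this article assume it extracts to /home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin.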

[hadoop@h71 ~]$ tar -zxvf flume-ng-1.6.0-cdh5.5.2.tar.gz

2. Edit the flume-env.sh configuration file:

The main change is setting the JAVA_HOME variable:

[hadoop@h71 apache-flume-1.6.0-cdh5.5.2-bin]$ cp conf/flume-env.sh.template conf/flume-env.sh
[hadoop@h71 apache-flume-1.6.0-cdh5.5.2-bin]$ vi conf/flume-env.sh
# Add:
export JAVA_HOME=/usr/jdk1.7.0_25

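Note: set this to your own Java installation directory; mine is jdk1.7.0_25. Make sure your Java version is not too old. If you do not know where Java is installed, run echo $JAVA_HOME in a shell where the Java environment is already configured.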
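If JAVA_HOME is not set in your shell, echo $JAVA_HOME prints nothing. A minimal sketch of one fallback is shown below, assuming a Linux system with GNU readlink and a java binary on the PATH. The helper name java_home_from_binary is made up for this example, and the result is only a candidate: on some layouts the binary lives under a jre/ subdirectory, so check the printed path before using it.

```shell
# Derive a JAVA_HOME candidate from the path of a java executable by
# stripping the trailing /bin/java.
# java_home_from_binary is a hypothetical helper, not part of Flume.
java_home_from_binary() {
  # $1: path to a java executable, e.g. /usr/jdk1.7.0_25/bin/java
  dirname "$(dirname "$1")"
}

# Resolve symlinks first (readlink -f is GNU-specific), then print the
# candidate directory. Nothing is printed if java is not on the PATH.
if command -v java >/dev/null 2>&1; then
  java_home_from_binary "$(readlink -f "$(command -v java)")"
fi
```

The printed directory is what you would put after export JAVA_HOME= in conf/flume-env.sh.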

3. Verify the installation:

[hadoop@h71 apache-flume-1.6.0-cdh5.5.2-bin]$ bin/flume-ng version
Flume 1.6.0-cdh5.5.2
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: f65a722cd2d1e7ceeda972570a5a6ee01c3a0a3d
Compiled by jenkins on Mon Jan 25 16:38:11 PST 2016
From source with checksum 028a2c6b035a03df1dfa91a3feda3424

II. Import data into HBase

1. Preparation:

[hadoop@h71 ~]$ cd hbase-1.0.0-cdh5.5.2/lib/

# Then copy the following files into Flume's lib directory:
[hadoop@h71 lib]$ cp protobuf-java-2.5.0.jar /home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin/lib/
[hadoop@h71 lib]$ cp hbase-protocol-1.0.0-cdh5.5.2.jar /home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin/lib/
[hadoop@h71 lib]$ cp hbase-client-1.0.0-cdh5.5.2.jar /home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin/lib/
[hadoop@h71 lib]$ cp hbase-common-1.0.0-cdh5.5.2.jar /home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin/lib/
[hadoop@h71 lib]$ cp hbase-server-1.0.0-cdh5.5.2.jar /home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin/lib/
[hadoop@h71 lib]$ cp hbase-hadoop2-compat-1.0.0-cdh5.5.2.jar /home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin/lib/
[hadoop@h71 lib]$ cp hbase-hadoop-compat-1.0.0-cdh5.5.2.jar /home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin/lib/
[hadoop@h71 lib]$ cp htrace-core-3.2.0-incubating.jar /home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin/lib/
# Note: you can also simply copy every jar under hbase-1.0.0-cdh5.5.2/lib into Flume's lib directory
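The eight cp commands above can also be written as a loop. This is a sketch only: the function name copy_hbase_jars is invented for this example, the jar names are the ones listed above, and the two directories are passed in as arguments so the paths can be adapted to your layout.

```shell
# Copy the HBase jars listed above from an HBase lib directory into
# Flume's lib directory, reporting any jar that is not found.
# copy_hbase_jars is a hypothetical helper name.
copy_hbase_jars() {
  src="$1"   # e.g. /home/hadoop/hbase-1.0.0-cdh5.5.2/lib
  dst="$2"   # e.g. /home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin/lib
  for jar in \
    protobuf-java-2.5.0.jar \
    hbase-protocol-1.0.0-cdh5.5.2.jar \
    hbase-client-1.0.0-cdh5.5.2.jar \
    hbase-common-1.0.0-cdh5.5.2.jar \
    hbase-server-1.0.0-cdh5.5.2.jar \
    hbase-hadoop2-compat-1.0.0-cdh5.5.2.jar \
    hbase-hadoop-compat-1.0.0-cdh5.5.2.jar \
    htrace-core-3.2.0-incubating.jar
  do
    if [ -f "$src/$jar" ]; then
      cp "$src/$jar" "$dst/"
    else
      echo "missing: $src/$jar" >&2
    fi
  done
}

# Example call with the paths used in this article:
# copy_hbase_jars /home/hadoop/hbase-1.0.0-cdh5.5.2/lib /home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin/lib
```

Listing the jars explicitly keeps Flume's lib directory small; copying everything, as the note above suggests, is simpler but brings in many jars Flume does not need.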

Make sure the test_idoall_org table already exists in HBase:

hbase(main):002:0> create 'test_idoall_org','uid','name'
0 row(s) in 0.6730 seconds

=> Hbase::Table - test_idoall_org
hbase(main):003:0> put 'test_idoall_org','10086','name:idoall','idoallvalue'
0 row(s) in 0.0960 seconds

2. Create the configuration file:

[hadoop@h71 apache-flume-1.6.0-cdh5.5.2-bin]$ vi conf/hbase_simple.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/hadoop/data.txt

# Describe the sink
a1.sinks.k1.type = hbase
a1.sinks.k1.table = test_idoall_org
a1.sinks.k1.columnFamily = name
a1.sinks.k1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
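The sink above uses RegexHbaseEventSerializer with no further options, so each line is written as a whole into one column. To my knowledge the serializer also accepts serializer.regex and serializer.colNames, where every capture group in the regex is stored under the matching column name. The sketch below appends such settings to a config file. The function name, the regex, and the column names first_word and rest are illustrative assumptions and not part of the original setup.

```shell
# Append optional RegexHbaseEventSerializer settings to a Flume config file.
# add_regex_serializer_opts, the regex and the column names (first_word,
# rest) are assumptions made up for this example.
add_regex_serializer_opts() {
  conf_file="$1"
  # The quoted 'EOF' keeps the shell from expanding $ inside the regex.
  cat >> "$conf_file" <<'EOF'

# Split each line at the first space into two columns
a1.sinks.k1.serializer.regex = ^([^ ]+) (.*)$
a1.sinks.k1.serializer.colNames = first_word,rest
EOF
}

# Example, run from the Flume directory:
# add_regex_serializer_opts conf/hbase_simple.conf
```

With these settings a line such as "hello idoall.org from flume" would be expected to land in name:first_word and name:rest. Lines that do not match the regex are likely to be skipped by the serializer, so test the pattern against real input first.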

3. Start the Flume agent:

[hadoop@h71 apache-flume-1.6.0-cdh5.5.2-bin]$ bin/flume-ng agent -c . -f conf/hbase_simple.conf -n a1 -Dflume.root.logger=INFO,console
... (earlier output omitted; there is too much of it)
:/home/hadoop/hive-1.1.0-cdh5.5.2/lib/logredactor-1.0.3.jar:/home/hadoop/hive-1.1.0-cdh5.5.2/lib/commons-dbcp-1.4.jar:/home/hadoop/hive-1.1.0-cdh5.5.2/lib/jcommander-1.32.jar
12/12/13 00:21:08 INFO zookeeper.ZooKeeper: Client environment:java.library.path=:/home/hadoop/hadoop-2.6.0-cdh5.5.2/lib/native
12/12/13 00:21:08 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
12/12/13 00:21:08 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
12/12/13 00:21:08 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
12/12/13 00:21:08 INFO zookeeper.ZooKeeper: Client environment:os.arch=i386
12/12/13 00:21:08 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.18-194.el5
12/12/13 00:21:08 INFO zookeeper.ZooKeeper: Client environment:user.name=hadoop
12/12/13 00:21:08 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop
12/12/13 00:21:08 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/hadoop/apache-flume-1.6.0-cdh5.5.2-bin
12/12/13 00:21:08 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x1d630160x0, quorum=localhost:2181, baseZNode=/hbase
12/12/13 00:21:09 INFO zookeeper.ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
12/12/13 00:21:09 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /127.0.0.1:47668, server: 127.0.0.1/127.0.0.1:2181
12/12/13 00:21:09 INFO zookeeper.ClientCnxn: Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x3b8fd59e830000, negotiated timeout = 90000
12/12/13 00:21:10 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
12/12/13 00:21:10 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: k1: Successfully registered new MBean.
12/12/13 00:21:10 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: k1 started
(The agent has started successfully ...)

4. Generate data:

[hadoop@h71 ~]$ touch data.txt
[hadoop@h71 ~]$ echo "hello idoall.org from flume" >> data.txt

Now open the HBase shell; the new data has been inserted:

hbase(main):005:0> scan 'test_idoall_org'
ROW                          COLUMN+CELL
 10086                       column=name:idoall, timestamp=1355329032253, value=idoallvalue
 1355329550628-0EZpfeEvxG-0  column=name:payload, timestamp=1355329383396, value=hello idoall.org from flume
2 row(s) in 0.0140 seconds
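The row written by Flume has a generated key, 1355329550628-0EZpfeEvxG-0. It appears to follow a millisecond timestamp, random string, counter pattern, which as far as I can tell is how RegexHbaseEventSerializer builds keys when no row key is taken from the event itself. Under that assumption, the leading timestamp can be pulled out with plain shell parameter expansion. The function name rowkey_millis is invented for this example.

```shell
# Extract the leading millisecond timestamp from a generated row key,
# assuming the key looks like <millis>-<random>-<counter>.
# rowkey_millis is a hypothetical helper name.
rowkey_millis() {
  # ${1%%-*} drops everything from the first hyphen onward.
  printf '%s\n' "${1%%-*}"
}

rowkey_millis "1355329550628-0EZpfeEvxG-0"   # prints 1355329550628
```

Because the keys start with a timestamp, rows written by this sink sort roughly in arrival order in a scan.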
