我是靠谱客的博主 野性小刺猬,最近开发中收集的这篇文章主要介绍sqoop在Linux中环境搭建,sqoop的安装与使用,觉得挺不错的,现在分享给大家,希望可以做个参考。

概述

官方下载页链接:http://www.apache.org/dyn/clo...

这里用的sqoop版本为1.X

下载路径:http://mirrors.hust.edu.cn/ap...

# 下载

wget http://mirrors.hust.edu.cn/apache/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

# 解压

tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /usr/local

# 配置文件模板复制

cp /usr/local/sqoop/conf/sqoop-env-template.sh /usr/local/sqoop/conf/sqoop-env.sh

cp /usr/local/sqoop/conf/sqoop-site-template.xml /usr/local/sqoop/conf/sqoop-site.xml

ln -s /usr/local/hive/conf/hive-site.conf /usr/local/sqoop/conf/hive-site.conf

配置文件介绍

sqoop-env.sh

# 设置hive家目录

export HIVE_HOME=/usr/local/hive

添加环境变量

/etc/profile

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/hive/lib/*

添加mysql驱动

将mysql连接器驱动放入sqoop安装目录下的lib目录中,这里要注意别用最新的驱动

博主这里用的是5.1

驱动下载页面链接:https://dev.mysql.com/downloa...

测试

/usr/local/sqoop/bin/sqoop

list-databases

--connect jdbc:mysql://hadoop001:3306/

--username root

--password 123456

mysql导入hive

# mysql全表导入hive

bin/sqoop import

--driver com.mysql.jdbc.Driver

--connect jdbc:mysql://hadoop001:3306/hadoop

--username root

--password 123456

--table test

--fields-terminated-by ' 01'

--lines-terminated-by 'n'

--delete-target-dir

--num-mappers 1

--hive-import

--hive-database default

--hive-table test

--direct

# mysql导入hive增量更新

bin/sqoop import

--driver com.mysql.jdbc.Driver

--connect jdbc:mysql://hadoop001:3306/hadoop

--username root

--password 123456

--table test

--check-column time

--incremental lastmodified

--last-value '2018-08-09 15:30:29'

--merge-key id

--fields-terminated-by ' 01'

--lines-terminated-by 'n'

--num-mappers 1

--target-dir /user/hive/warehouse/test

--hive-drop-import-delims # --hive-delims-replacement '-'

导入数据过程中可能报错

==main ERROR Could not register mbeans java.security.AccessControlException: access denied ("javax.management.MBeanTrustPermission" "register") ==

这个是java的安全策略问题;找到jre包,/java/jre/lib/security,在这个包下面有个java.policy文件,打开编辑它: 添加如下permission javax.management.MBeanTrustPermission "register";

job机制

# 添加一个增量更新job

bin/sqoop job --create test --

import

--driver com.mysql.jdbc.Driver

--connect jdbc:mysql://hadoop001:3306/hadoop

--username root

--password 123456

--table test

--check-column time

--incremental lastmodified

--last-value '2018-08-09 15:30:29'

--merge-key id

--fields-terminated-by ' 01'

--lines-terminated-by 'n'

--num-mappers 1

--target-dir /user/hive/warehouse/test

添加过程可能报错

以下为报错内容

ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException

java.lang.NullPointerException

at org.json.JSONObject.(JSONObject.java:144)

at org.apache.sqoop.util.SqoopJsonUtil.getJsonStringforMap(SqoopJsonUtil.java:43)

at org.apache.sqoop.SqoopOptions.writeProperties(SqoopOptions.java:785)

at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.createInternal(HsqldbJobStorage.java:399)

at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.create(HsqldbJobStorage.java:379)

at org.apache.sqoop.tool.JobTool.createJob(JobTool.java:181)

at org.apache.sqoop.tool.JobTool.run(JobTool.java:294)

at org.apache.sqoop.Sqoop.run(Sqoop.java:147)

at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)

at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)

at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)

at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)

at org.apache.sqoop.Sqoop.main(Sqoop.java:252)

报错原因是因为缺少json包:==java-json-schema.jar==

将jar包添加到${SQOOP_HOME}/lib下即可

执行job

bin/sqoop job --exec test

执行job过程中会提示输入数据库密码,可以在sqoop-site.xml中添加下面的相关配置,添加后重新创建job即可实现免密码

sqoop.metastore.client.record.password

true

If true, allow saved passwords in the metastore.

再次执行job后查看数据已被更新

查看job

bin/sqoop job --show test

Job: test

Tool: import

Options:

----------------------------

verbose = false

hcatalog.drop.and.create.table = false

# sqoop会自动帮你记录last-value并更新,这使得增量更新变得相当简便

incremental.last.value = 2018-08-10 03:51:47.0

db.connect.string = jdbc:mysql://hadoop001:3306/hadoop

codegen.output.delimiters.escape = 0

codegen.output.delimiters.enclose.required = false

codegen.input.delimiters.field = 0

mainframe.input.dataset.type = p

split.limit = null

hbase.create.table = false

db.require.password = false

skip.dist.cache = false

hdfs.append.dir = false

db.table = test

codegen.input.delimiters.escape = 0

db.password = 123456

accumulo.create.table = false

import.fetch.size = null

codegen.input.delimiters.enclose.required = false

db.username = root

reset.onemapper = false

codegen.output.delimiters.record = 10

import.max.inline.lob.size = 16777216

sqoop.throwOnError = false

hbase.bulk.load.enabled = false

hcatalog.create.table = false

db.clear.staging.table = false

incremental.col = time

codegen.input.delimiters.record = 0

enable.compression = false

hive.overwrite.table = false

hive.import = false

codegen.input.delimiters.enclose = 0

accumulo.batch.size = 10240000

hive.drop.delims = false

customtool.options.jsonmap = {}

codegen.output.delimiters.enclose = 0

hdfs.delete-target.dir = false

codegen.output.dir = .

codegen.auto.compile.dir = true

relaxed.isolation = false

mapreduce.num.mappers = 1

accumulo.max.latency = 5000

import.direct.split.size = 0

sqlconnection.metadata.transaction.isolation.level = 2

codegen.output.delimiters.field = 9

export.new.update = UpdateOnly

incremental.mode = DateLastModified

hdfs.file.format = TextFile

sqoop.oracle.escaping.disabled = true

codegen.compile.dir = /tmp/sqoop-hadoop/compile/028365970856b88aa0aa91435ff172e5

direct.import = false

temporary.dirRoot = _sqoop

hdfs.target.dir = /user/hive/warehouse/test

hive.fail.table.exists = false

merge.key.col = id

jdbc.driver.class = com.mysql.jdbc.Driver

db.batch = false

==通常情况下,我们可以结合sqoop job和crontab等任务调度工具实现相关业务==

hive导出到mysql

bin/sqoop export

--driver com.mysql.jdbc.Driver

--connect "jdbc:mysql://hadoop001:3306/hadoop?useUnicode=true&characterEncoding=utf-8"

--username root

--password 123456

--table test_out

--num-mappers 1

--export-dir /user/hive/warehouse/test_out

--fields-terminated-by ' 01'

--lines-terminated-by 'n'

最后

以上就是野性小刺猬为你收集整理的sqoop在Linux中环境搭建,sqoop的安装与使用的全部内容,希望文章能够帮你解决sqoop在Linux中环境搭建,sqoop的安装与使用所遇到的程序开发问题。

如果觉得靠谱客网站的内容还不错,欢迎将靠谱客网站推荐给程序员好友。

本图文内容来源于网友提供,作为学习参考使用,或来自网络收集整理,版权属于原作者所有。
点赞(72)

评论列表共有 0 条评论

立即
投稿
返回
顶部