Overview
Official download page: http://www.apache.org/dyn/clo...
This guide uses Sqoop 1.x.
Download mirror: http://mirrors.hust.edu.cn/ap...
# Download
wget http://mirrors.hust.edu.cn/apache/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
# Extract
tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /usr/local
# Rename the extracted directory so the paths below resolve
mv /usr/local/sqoop-1.4.7.bin__hadoop-2.6.0 /usr/local/sqoop
# Copy the configuration-file templates
cp /usr/local/sqoop/conf/sqoop-env-template.sh /usr/local/sqoop/conf/sqoop-env.sh
cp /usr/local/sqoop/conf/sqoop-site-template.xml /usr/local/sqoop/conf/sqoop-site.xml
# Link Hive's configuration so Sqoop can find it
ln -s /usr/local/hive/conf/hive-site.xml /usr/local/sqoop/conf/hive-site.xml
Configuration files
sqoop-env.sh
# Set the Hive home directory
export HIVE_HOME=/usr/local/hive
Environment variables
/etc/profile
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/hive/lib/*
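For convenience, SQOOP_HOME and PATH can be exported in /etc/profile alongside the Hadoop classpath line above. A minimal sketch, assuming the /usr/local/sqoop install path used in this guide:

```shell
# /etc/profile additions (paths assume the install location used above)
export SQOOP_HOME=/usr/local/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
```

After editing, run `source /etc/profile` so the current shell picks up the changes.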
Adding the MySQL driver
Place the MySQL Connector/J jar in the lib directory under the Sqoop installation. Note: avoid the newest driver; version 5.1 is used here.
Driver download page: https://dev.mysql.com/downloa...
Testing
/usr/local/sqoop/bin/sqoop list-databases \
    --connect jdbc:mysql://hadoop001:3306/ \
    --username root \
    --password 123456
Importing from MySQL into Hive
# Full-table import from MySQL into Hive
bin/sqoop import \
    --driver com.mysql.jdbc.Driver \
    --connect jdbc:mysql://hadoop001:3306/hadoop \
    --username root \
    --password 123456 \
    --table test \
    --fields-terminated-by '\001' \
    --lines-terminated-by '\n' \
    --delete-target-dir \
    --num-mappers 1 \
    --hive-import \
    --hive-database default \
    --hive-table test \
    --direct
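The `--fields-terminated-by '\001'` above matches Hive's default field delimiter, the non-printing Ctrl-A byte (0x01). A quick way to see the delimiter in an imported file, assuming GNU coreutils:

```shell
# Render the bytes of a sample record; the \001 separator shows up
# as octal 001 in od's character output
printf 'id\001name\n' | od -An -c
```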
# Incremental import from MySQL into Hive
bin/sqoop import \
    --driver com.mysql.jdbc.Driver \
    --connect jdbc:mysql://hadoop001:3306/hadoop \
    --username root \
    --password 123456 \
    --table test \
    --check-column time \
    --incremental lastmodified \
    --last-value '2018-08-09 15:30:29' \
    --merge-key id \
    --fields-terminated-by '\001' \
    --lines-terminated-by '\n' \
    --num-mappers 1 \
    --target-dir /user/hive/warehouse/test \
    --hive-drop-import-delims   # or use --hive-delims-replacement '-' instead
The import may fail with the following error:
==main ERROR Could not register mbeans java.security.AccessControlException: access denied ("javax.management.MBeanTrustPermission" "register")==
This is a Java security-policy issue. Locate the java.policy file under the JRE's security directory (e.g. /java/jre/lib/security), open it, and add the following line: permission javax.management.MBeanTrustPermission "register";
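The edit can also be scripted. A sketch that appends a standalone grant block to the policy file (the $JAVA_HOME-relative path is an assumption and varies by JDK layout):

```shell
# Back up the policy file, then append a grant block
# ($JAVA_HOME/jre/lib/security is an assumed JRE layout)
cp "$JAVA_HOME/jre/lib/security/java.policy"{,.bak}
cat >> "$JAVA_HOME/jre/lib/security/java.policy" <<'EOF'
grant {
    permission javax.management.MBeanTrustPermission "register";
};
EOF
```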
The job mechanism
# Create an incremental-update job
bin/sqoop job --create test -- import \
    --driver com.mysql.jdbc.Driver \
    --connect jdbc:mysql://hadoop001:3306/hadoop \
    --username root \
    --password 123456 \
    --table test \
    --check-column time \
    --incremental lastmodified \
    --last-value '2018-08-09 15:30:29' \
    --merge-key id \
    --fields-terminated-by '\001' \
    --lines-terminated-by '\n' \
    --num-mappers 1 \
    --target-dir /user/hive/warehouse/test
Creating the job may fail with the following error:
ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
java.lang.NullPointerException
at org.json.JSONObject.<init>(JSONObject.java:144)
at org.apache.sqoop.util.SqoopJsonUtil.getJsonStringforMap(SqoopJsonUtil.java:43)
at org.apache.sqoop.SqoopOptions.writeProperties(SqoopOptions.java:785)
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.createInternal(HsqldbJobStorage.java:399)
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.create(HsqldbJobStorage.java:379)
at org.apache.sqoop.tool.JobTool.createJob(JobTool.java:181)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:294)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
The error is caused by a missing JSON jar: ==java-json-schema.jar==
Adding the jar to ${SQOOP_HOME}/lib fixes it.
Running a job
bin/sqoop job --exec test
Running the job prompts for the database password. To skip the prompt, add the configuration below to sqoop-site.xml and recreate the job:
<property>
    <name>sqoop.metastore.client.record.password</name>
    <value>true</value>
    <description>If true, allow saved passwords in the metastore.</description>
</property>
Run the job again and verify that the data has been updated.
Inspecting a job
bin/sqoop job --show test
Job: test
Tool: import
Options:
----------------------------
verbose = false
hcatalog.drop.and.create.table = false
# Sqoop records and updates last-value automatically, which makes incremental updates quite easy
incremental.last.value = 2018-08-10 03:51:47.0
db.connect.string = jdbc:mysql://hadoop001:3306/hadoop
codegen.output.delimiters.escape = 0
codegen.output.delimiters.enclose.required = false
codegen.input.delimiters.field = 0
mainframe.input.dataset.type = p
split.limit = null
hbase.create.table = false
db.require.password = false
skip.dist.cache = false
hdfs.append.dir = false
db.table = test
codegen.input.delimiters.escape = 0
db.password = 123456
accumulo.create.table = false
import.fetch.size = null
codegen.input.delimiters.enclose.required = false
db.username = root
reset.onemapper = false
codegen.output.delimiters.record = 10
import.max.inline.lob.size = 16777216
sqoop.throwOnError = false
hbase.bulk.load.enabled = false
hcatalog.create.table = false
db.clear.staging.table = false
incremental.col = time
codegen.input.delimiters.record = 0
enable.compression = false
hive.overwrite.table = false
hive.import = false
codegen.input.delimiters.enclose = 0
accumulo.batch.size = 10240000
hive.drop.delims = false
customtool.options.jsonmap = {}
codegen.output.delimiters.enclose = 0
hdfs.delete-target.dir = false
codegen.output.dir = .
codegen.auto.compile.dir = true
relaxed.isolation = false
mapreduce.num.mappers = 1
accumulo.max.latency = 5000
import.direct.split.size = 0
sqlconnection.metadata.transaction.isolation.level = 2
codegen.output.delimiters.field = 9
export.new.update = UpdateOnly
incremental.mode = DateLastModified
hdfs.file.format = TextFile
sqoop.oracle.escaping.disabled = true
codegen.compile.dir = /tmp/sqoop-hadoop/compile/028365970856b88aa0aa91435ff172e5
direct.import = false
temporary.dirRoot = _sqoop
hdfs.target.dir = /user/hive/warehouse/test
hive.fail.table.exists = false
merge.key.col = id
jdbc.driver.class = com.mysql.jdbc.Driver
db.batch = false
==In practice, sqoop job is typically combined with a scheduler such as crontab to automate these workflows.==
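As a sketch, the incremental job created above could be scheduled nightly with crontab (the schedule and log path are illustrative assumptions):

```shell
# crontab -e entry: run the incremental job every day at 02:00,
# appending output to a log file (path is illustrative)
0 2 * * * /usr/local/sqoop/bin/sqoop job --exec test >> /var/log/sqoop-test.log 2>&1
```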
Exporting from Hive to MySQL
bin/sqoop export \
    --driver com.mysql.jdbc.Driver \
    --connect "jdbc:mysql://hadoop001:3306/hadoop?useUnicode=true&characterEncoding=utf-8" \
    --username root \
    --password 123456 \
    --table test_out \
    --num-mappers 1 \
    --export-dir /user/hive/warehouse/test_out \
    --fields-terminated-by '\001' \
    --lines-terminated-by '\n'
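Note that sqoop export requires the target table to already exist in MySQL. A minimal sketch of creating it first (the column names are illustrative assumptions; match your Hive table's schema):

```shell
# Create the export target table up front
# (id/name columns are placeholders, not the actual schema)
mysql -h hadoop001 -uroot -p123456 -e \
  "CREATE TABLE IF NOT EXISTS hadoop.test_out (id INT, name VARCHAR(255));"
```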