Overview
Questions this article addresses:
1. Which configuration file must be edited so that Sqoop can transfer data between Hadoop and a relational database?
2. Into which directory must the relational database's JDBC driver jar be copied?
I. Introduction to Sqoop 1.4.4
Sqoop is a tool for transferring data between Hadoop and relational databases. With Sqoop we can import data from a relational database (such as MySQL or Oracle) into HDFS (the Hadoop Distributed File System), using Hadoop's MapReduce parallel computation framework for the transfer, and likewise export data from HDFS back into a relational database.
Sqoop 2.x is now available and improves on Sqoop 1.x in security, concurrency, and other areas, but its feature set is still limited, so here we use Sqoop 1.x to learn how to import and export data between relational databases and Hadoop. The following describes installing Sqoop 1.4.4 on a Hadoop 2.2.0 cluster.
II. Downloading and extracting the Sqoop 1.4.4 package
Download the Sqoop 1.4.4 package sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz from the Sqoop website and extract it in a directory of your choice:
[hadoopUser@secondmgt sqoop1.0]$ ls
sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
[hadoopUser@secondmgt sqoop1.0]$ tar -zxvf sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
III. Configuring environment variables
Configure the Sqoop 1.4.4 environment variables in .bashrc in the user's home directory, so the commands can be run from anywhere:
#Sqoop1.4.4 Configure
export SQOOP_HOME=/home/hadoopUser/cloud/sqoop1.0/sqoop-1.4.4.bin__hadoop-2.0.4-alpha
export PATH=$PATH:$SQOOP_HOME/bin
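After editing .bashrc, the new variables take effect in the current shell once the file is sourced. The following sketch applies the same two exports directly (equivalent to `source ~/.bashrc`) and confirms them; the path is this walkthrough's example install location, so adjust it to your own:

```shell
# Apply the .bashrc additions in the current shell and confirm them.
# The install path below is the example location from this walkthrough.
export SQOOP_HOME=/home/hadoopUser/cloud/sqoop1.0/sqoop-1.4.4.bin__hadoop-2.0.4-alpha
export PATH=$PATH:$SQOOP_HOME/bin

# Print the variables to confirm they are set.
echo "SQOOP_HOME=$SQOOP_HOME"
echo "$PATH" | tr ':' '\n' | grep sqoop
```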
IV. Verifying the environment variables
[hadoopUser@secondmgt ~]$ sqoop help
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
usage: sqoop COMMAND [ARGS]
Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information
See 'sqoop help COMMAND' for information on a specific command.
V. Configuring sqoop-env.sh
In $SQOOP_HOME/conf, copy sqoop-env-template.sh to a new file named sqoop-env.sh and edit its contents:
# Set Hadoop-specific environment variables here.
#Set path to where bin/hadoop is available
#export HADOOP_COMMON_HOME=
export HADOOP_COMMON_HOME=/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0/share/hadoop/mapreduce
#set the path to where bin/hbase is available
#export HBASE_HOME=
#Set the path to where bin/hive is available
#export HIVE_HOME=
#Set the path for where zookeper config dir is
#export ZOOCFGDIR=
HADOOP_COMMON_HOME: the Hadoop installation root directory.
HADOOP_MAPRED_HOME: the MapReduce directory.
HBASE_HOME: the HBase installation root directory (not needed here; may be left unset).
HIVE_HOME: the Hive installation root directory (not needed here; may be left unset).
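The copy-and-edit step above can be sketched as follows. To keep the sketch runnable anywhere, $SQOOP_HOME is simulated with a temporary directory containing a stand-in template; on a real install, simply cd into your actual $SQOOP_HOME/conf and run the `cp` and edit steps:

```shell
# Sketch of creating sqoop-env.sh from the shipped template.
# $SQOOP_HOME is simulated with a temp directory so the commands are
# self-contained; on a real install use your actual Sqoop directory.
SQOOP_HOME=$(mktemp -d)
mkdir -p "$SQOOP_HOME/conf"
cat > "$SQOOP_HOME/conf/sqoop-env-template.sh" <<'EOF'
# Set Hadoop-specific environment variables here.
#export HADOOP_COMMON_HOME=
#export HADOOP_MAPRED_HOME=
EOF

cd "$SQOOP_HOME/conf"
cp sqoop-env-template.sh sqoop-env.sh    # copy the template
# Fill in the Hadoop paths (example path from this walkthrough):
echo 'export HADOOP_COMMON_HOME=/home/hadoopUser/cloud/hadoop/programs/hadoop-2.2.0' >> sqoop-env.sh
ls
```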
VI. Copying the MySQL JDBC jar into lib
Copy the MySQL JDBC driver jar into $SQOOP_HOME/lib. After copying, the directory looks like this:
[hadoopUser@secondmgt lib]$ ls
ant-contrib-1.0b3.jar       avro-ipc-1.5.3.jar      hsqldb-1.8.0.10.jar           jopt-simple-3.2.jar                  snappy-java-1.0.3.2.jar
ant-eclipse-1.0-jvm1.2.jar  avro-mapred-1.5.3.jar   jackson-core-asl-1.7.3.jar    mysql-connector-java-5.1.18-bin.jar
avro-1.5.3.jar              commons-io-1.4.jar      jackson-mapper-asl-1.7.3.jar  paranamer-2.3.jar
We are using a MySQL database here; if you are using Oracle, copy the corresponding Oracle JDBC jar instead.
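The driver install is a plain file copy into $SQOOP_HOME/lib. A sketch, with the downloaded jar and the Sqoop directory simulated by temporary files so it runs anywhere (on a real install, copy the jar you downloaded from MySQL into your actual $SQOOP_HOME/lib):

```shell
# Sketch of installing the MySQL JDBC driver into Sqoop's lib directory.
# Both directories are simulated with mktemp so the sketch is
# self-contained; the jar name matches the version in this walkthrough.
SQOOP_HOME=$(mktemp -d)
mkdir -p "$SQOOP_HOME/lib"

DOWNLOADS=$(mktemp -d)
JAR=mysql-connector-java-5.1.18-bin.jar
touch "$DOWNLOADS/$JAR"                  # stand-in for the downloaded driver

cp "$DOWNLOADS/$JAR" "$SQOOP_HOME/lib/"  # the actual install step
ls "$SQOOP_HOME/lib" | grep mysql
```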
VII. Testing and verification
Sqoop 1.x has no daemon to start; it can be used directly. We run a single command to check that everything is configured correctly:
[hadoopUser@secondmgt ~]$ sqoop list-databases --connect jdbc:mysql://secondmgt:3306/ --password hive --username hive
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
15/01/17 20:07:39 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/01/17 20:07:39 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
goodseval
hive
mysql
spice
sqoopdb
test
Here --connect jdbc:mysql://secondmgt:3306/ --password hive --username hive supplies the MySQL connection parameters. (Note the warning in the output above: passing the password on the command line is insecure, and Sqoop suggests using -P to be prompted for it instead.)
For comparison, the databases in my MySQL instance are:
[hadoopUser@secondmgt ~]$ mysql -uhive -phive
Welcome to the MySQL monitor.
Commands end with ; or \g.
Your MySQL connection id is 410
Server version: 5.1.73 Source distribution
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| goodseval          |
| hive               |
| mysql              |
| spice              |
| sqoopdb            |
| test               |
+--------------------+
7 rows in set (0.00 sec)
The result matches what Sqoop returned, so Sqoop is installed successfully.
Recommended reading:
Next article: Using Sqoop 1.4.4 to import data from MySQL tables into HDFS