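Overview
1. Download: http://mirrors.hust.edu.cn/apache/sqoop/1.4.6/
2. Upload the package sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz to the virtual machine.
3. Extract the Sqoop package into the target directory (and rename it afterwards if you like), for example: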
tar -zxf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz -C /home/hadoop/software/
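The extract step can be wrapped in a small helper. This is only a sketch: the function name install_sqoop and the short name "sqoop" are made up for illustration, and it assumes the tarball unpacks into a directory named after the tarball without its .tar.gz suffix. It adds a symlink instead of renaming, so the long directory name is still there for any command that refers to it.

```shell
# Sketch: unpack a Sqoop tarball and add a short "sqoop" symlink next to it.
# install_sqoop and the link name "sqoop" are illustrative, not part of Sqoop.
install_sqoop() {
    tarball=$1   # e.g. sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
    dest=$2      # e.g. /home/hadoop/software
    mkdir -p "$dest" || return 1
    tar -zxf "$tarball" -C "$dest" || return 1
    # Assumption: the top-level directory inside the tarball has the same
    # name as the tarball minus ".tar.gz".
    name=$(basename "$tarball" .tar.gz)
    # Relative link target, so the link stays valid if $dest is moved.
    ln -s "$name" "$dest/sqoop"
}
```

A call such as install_sqoop sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz /home/hadoop/software would then leave both the original directory and /home/hadoop/software/sqoop in place.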
4. Edit the configuration file
In the conf directory under the Sqoop root, rename the configuration template and open it:
mv sqoop-env-template.sh sqoop-env.sh
vi sqoop-env.sh
Set the following:
export HADOOP_COMMON_HOME=/home/hadoop/software/hadoop-2.6.5
export HADOOP_MAPRED_HOME=/home/hadoop/software/hadoop-2.6.5
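If you would rather not edit the file by hand, the same two settings can be written from a script. The following is a rough sketch, not an official Sqoop tool: write_sqoop_env is a hypothetical helper name, and it copies the template instead of moving it so that the original template stays available.

```shell
# Sketch: create conf/sqoop-env.sh and set both Hadoop home variables in it.
# write_sqoop_env is an illustrative helper, not something Sqoop ships.
write_sqoop_env() {
    conf_dir=$1      # the conf directory under the Sqoop root
    hadoop_home=$2   # e.g. /home/hadoop/software/hadoop-2.6.5
    env_file=$conf_dir/sqoop-env.sh
    # Start from the template only if sqoop-env.sh does not exist yet.
    if [ ! -f "$env_file" ] && [ -f "$conf_dir/sqoop-env-template.sh" ]; then
        cp "$conf_dir/sqoop-env-template.sh" "$env_file"
    fi
    for var in HADOOP_COMMON_HOME HADOOP_MAPRED_HOME; do
        line="export $var=$hadoop_home"
        # Append only when this exact line is missing, so reruns are harmless.
        grep -qxF "$line" "$env_file" 2>/dev/null || printf '%s\n' "$line" >> "$env_file"
    done
}
```

Because each line is appended only when absent, running the helper twice with the same arguments leaves the file unchanged the second time.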
5. Copy the JDBC driver into the lib directory of the Sqoop installation:
cp mysql-connector-java-5.1.27-bin.jar /home/hadoop/software/sqoop-1.4.6.bin__hadoop-2.0.4-alpha/lib/
6. Add environment variables (for example in /etc/profile or ~/.bashrc, then reload the file with source):
#sqoop
export SQOOP_HOME=/home/hadoop/software/sqoop-1.4.6.bin__hadoop-2.0.4-alpha
export PATH=$PATH:$SQOOP_HOME/bin
7. Verify Sqoop
bin/sqoop help
Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information
8. Test whether Sqoop can connect to the database
bin/sqoop list-databases --connect jdbc:mysql://192.168.2.129:3306/ --username root --password 000000
The output looks like this:
information_schema
metastore
mysql
oozie
performance_schema
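Passing the password with --password leaves it in the shell history and the process list, and Sqoop itself prints a warning suggesting -P (interactive prompt) instead. Another option is --password-file. The sketch below only prepares such a file; make_password_file is a hypothetical helper name. The file is written without a trailing newline, since a stray newline would most likely be treated as part of the password.

```shell
# Sketch: store a database password in a file readable only by its owner,
# for use with Sqoop's --password-file option.
# make_password_file is an illustrative helper name.
make_password_file() {
    pw_file=$1
    password=$2
    # umask 077 in a subshell: the file is created without group/other access.
    # printf '%s' writes the password with no trailing newline.
    ( umask 077; printf '%s' "$password" > "$pw_file" ) || return 1
    chmod 400 "$pw_file"
}
```

The connection test could then be run with --password-file pointing at that file in place of --password. Sqoop resolves the path against the configured default file system, so a local file probably needs a file:// prefix; check this against your own setup.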
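9. Import data
In Sqoop, "import" means transferring data from a system outside the big-data cluster (an RDBMS) into the cluster (HDFS, Hive, HBase), using the import keyword. Here I import into HDFS, so the Hadoop cluster has to be started first. If the cluster is not running, the import fails with an error like the following: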
22/03/31 15:59:23 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
22/03/31 15:59:24 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
22/03/31 15:59:24 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
22/03/31 15:59:24 INFO tool.CodeGenTool: Beginning code generation
22/03/31 15:59:24 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `staff` AS t LIMIT 1
22/03/31 15:59:24 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `staff` AS t LIMIT 1
22/03/31 15:59:24 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/software/hadoop-2.6.5
Note: /tmp/sqoop-root/compile/ec967c4570b35538a1d841eefdfda367/staff.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
22/03/31 15:59:25 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/ec967c4570b35538a1d841eefdfda367/staff.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/software/hadoop-2.6.5/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/software/hbase-1.3.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
22/03/31 15:59:35 INFO retry.RetryInvocationHandler: Exception while invoking getFileInfo of class ClientNamenodeProtocolTranslatorPB over HAMaster002/192.168.2.130:9000 after 1 fail over attempts. Trying to fail over after sleeping for 1462ms.
java.net.ConnectException: Call From HAMaster001/192.168.2.129 to HAMaster002:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1474)
at org.apache.hadoop.ipc.Client.call(Client.java:1401)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1977)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
at org.apache.sqoop.tool.ImportTool.deleteTargetDir(ImportTool.java:552)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:516)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
at org.apache.hadoop.ipc.Client.call(Client.java:1440)
... 26 more
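The trace above comes down to "Connection refused" on the NameNode port, so it can save time to check that HDFS answers before starting an import. A minimal sketch, assuming the hdfs command is on the PATH; require_hdfs is a made-up helper name:

```shell
# Sketch: fail early if HDFS cannot be reached.
# require_hdfs is an illustrative helper name.
require_hdfs() {
    # "hdfs dfs -test -d /" exits 0 only if the root directory can be queried,
    # which requires a reachable NameNode.
    if hdfs dfs -test -d / 2>/dev/null; then
        echo "HDFS is reachable"
    else
        echo "HDFS is not reachable; start the cluster (e.g. start-dfs.sh) first" >&2
        return 1
    fi
}
```

One way to use it is require_hdfs && bin/sqoop import ..., so the import only runs when the check passes. With a NameNode that is down, the check itself may take a while because the client retries before giving up.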
a. Make sure the MySQL service is running, then create a database and a table in MySQL and insert some rows.
create database company;
create table company.staff(id int(4) primary key not null auto_increment, name varchar(255), sex varchar(255));
insert into company.staff(name, sex) values('zhangsan', 'Male');
insert into company.staff(name, sex) values('lisi', 'FeMale');
b. Import data: full-table import
bin/sqoop import \
--connect jdbc:mysql://192.168.2.129:3306/company \
--username root \
--password 123456 \
--table staff \
--target-dir /home/hadoop/mysql-5.7.18/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t"
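After an import finishes, the result can be inspected directly in HDFS. The helper below is a sketch with a made-up name (show_import); it assumes a map-only import, whose output files are named part-m-00000, part-m-00001, and so on.

```shell
# Sketch: list an import's target directory and print its data files.
# show_import is an illustrative helper name.
show_import() {
    target_dir=$1   # the value that was passed to --target-dir
    # The glob is quoted so that HDFS, not the local shell, expands it.
    hdfs dfs -ls "$target_dir" && hdfs dfs -cat "$target_dir/part-m-*"
}
```

For the full-table import above this would be called as show_import /home/hadoop/mysql-5.7.18/company, and each row of staff should appear as one tab-separated line.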
c. Import the result of a query
bin/sqoop import \
--connect jdbc:mysql://192.168.2.129:3306/company \
--username root \
--password 123456 \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--query 'select name,sex from staff where id <=1 and $CONDITIONS;'
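The quoting of --query matters. Sqoop expects the literal token $CONDITIONS in the query and substitutes its own condition for it, so the shell must not expand it first. The short demonstration below uses only plain shell variables (the names single, double and broken are illustrative) to show what each quoting style actually hands to the program:

```shell
# Make sure no CONDITIONS variable is set, as in a typical shell session.
unset CONDITIONS

# Single quotes: the shell leaves $CONDITIONS untouched.
single='select name,sex from staff where id <= 1 and $CONDITIONS'

# Double quotes: the dollar sign has to be escaped to get the same text.
double="select name,sex from staff where id <= 1 and \$CONDITIONS"

# Double quotes without the escape: the shell replaces $CONDITIONS with
# an empty string, and the token never reaches Sqoop.
broken="select name,sex from staff where id <= 1 and $CONDITIONS"

printf '%s\n' "$single" "$double" "$broken"
```

The first two lines printed are identical and still contain $CONDITIONS; the third ends in "and " with nothing after it, which is the form to avoid.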
d. Import specific columns
bin/sqoop import \
--connect jdbc:mysql://192.168.2.129:3306/company \
--username root \
--password 123456 \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--columns id,sex \
--table staff