Sqoop is the bridge between relational databases and Hadoop storage systems, supporting imports and exports between a variety of relational data sources and Hive, HDFS, and HBase. Typically the relational tables live in a backup of the production environment and must be imported every day. Depending on the daily data volume, Sqoop can do a full-table import, which is fine when the volume of new data per day is small; Sqoop also provides an incremental import mechanism for larger volumes.
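For example, a nightly full-table import boils down to a single `sqoop import` invocation. In this sketch the connection URL, credentials, table name (orders), and HDFS path are all placeholders:

```bash
# Full-table import sketch: pull every row of the (hypothetical) orders table
# from MySQL into the given HDFS directory as plain text files.
sqoop import \
  --connect jdbc:mysql://localhost/sqoop_datas \
  --username sqoop \
  --password sqoop \
  --table orders \
  --target-dir /user/hadoop/orders
```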
Below are some commonly used Sqoop commands, along with their parameters:
| No. | Command | Class | Description |
|---|---|---|---|
| 1 | import | ImportTool | Import data from a relational database (a table or a free-form query) into HDFS |
| 2 | export | ExportTool | Export data from HDFS into a relational database |
| 3 | codegen | CodeGenTool | Generate Java classes for a database table and package them into a jar |
| 4 | create-hive-table | CreateHiveTableTool | Create a Hive table |
| 5 | eval | EvalSqlTool | Run a SQL statement and view the result |
| 6 | import-all-tables | ImportAllTablesTool | Import all tables of a database into HDFS |
| 7 | job | JobTool | Define and work with saved jobs |
| 8 | list-databases | ListDatabasesTool | List all database names |
| 9 | list-tables | ListTablesTool | List all tables in a database |
| 10 | merge | MergeTool | Merge the results of incremental imports |
| 11 | metastore | MetastoreTool | Run a standalone Sqoop metastore |
| 12 | help | HelpTool | Show help |
| 13 | version | VersionTool | Show the version |
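A quick sketch of a few of these commands in action; the connection URL, credentials, and the orders table are hypothetical:

```bash
# List all databases on the server, then the tables in one of them.
sqoop list-databases --connect jdbc:mysql://localhost --username sqoop --password sqoop
sqoop list-tables --connect jdbc:mysql://localhost/sqoop_datas --username sqoop --password sqoop

# eval runs an ad-hoc SQL statement and prints the result to the console.
sqoop eval --connect jdbc:mysql://localhost/sqoop_datas --username sqoop --password sqoop \
  --query "SELECT COUNT(*) FROM orders"
```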
Next I list Sqoop's generic arguments, and then the arguments specific to each of the 13 commands above. The generic arguments fall into the following groups:
- Common arguments
- Incremental import arguments
- Output line formatting arguments
- Input parsing arguments
- Hive arguments
- HBase arguments
- Generic Hadoop command-line arguments
1. Common arguments: generic parameters, mainly for the relational database connection:
| No. | Argument | Description | Example |
|---|---|---|---|
| 1 | --connect | JDBC URL of the relational database | jdbc:mysql://localhost/sqoop_datas |
| 2 | --connection-manager | Connection manager class to use; rarely needed | |
| 3 | --driver | JDBC driver class to use | |
| 4 | --hadoop-home | Override the Hadoop home directory | /home/hadoop |
| 5 | --help | Show help information | |
| 6 | --password | Password for the database connection | |
| 7 | --username | Username for the database connection | |
| 8 | --verbose | Print more information by lowering the log level | Takes no value |
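A minimal sketch showing the common arguments in use; the URL and credentials are placeholders, and --verbose simply takes no value:

```bash
# Common arguments apply to every Sqoop command; here they drive list-tables.
sqoop list-tables \
  --connect jdbc:mysql://localhost/sqoop_datas \
  --username sqoop \
  --password sqoop \
  --verbose
```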
2. Import control arguments:
| Argument | Description |
|---|---|
| --append | Append data to an existing dataset in HDFS |
| --as-avrodatafile | Imports data to Avro Data Files |
| --as-sequencefile | Imports data to SequenceFiles |
| --as-textfile | Imports data as plain text (default) |
| --boundary-query <statement> | Boundary query to use for creating splits |
| --columns <col,col,col…> | Columns to import from table |
| --direct | Use direct import fast path |
| --direct-split-size <n> | Split the input stream every n bytes when importing in direct mode |
| --inline-lob-limit <n> | Set the maximum size for an inline LOB |
| -m,--num-mappers <n> | Use n map tasks to import in parallel |
| -e,--query <statement> | Import the results of statement. |
| --split-by <column-name> | Column of the table used to split work units |
| --table <table-name> | Table to read |
| --target-dir <dir> | HDFS destination dir |
| --warehouse-dir <dir> | HDFS parent for table destination |
| --where <where clause> | WHERE clause to use during import |
| -z,--compress | Enable compression |
| --compression-codec <c> | Use Hadoop codec (default gzip) |
| --null-string <null-string> | The string to be written for a null value for string columns |
| --null-non-string <null-string> | The string to be written for a null value for non-string columns |
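A sketch of a free-form query import using several of these arguments. With --query, the statement must contain the literal token $CONDITIONS (kept in single quotes so the shell does not expand it), and --split-by is required when more than one mapper is used; the table and column names here are hypothetical:

```bash
# Import the result of a query in parallel across 4 mappers, splitting on id,
# and compress the output with the default codec (gzip).
sqoop import \
  --connect jdbc:mysql://localhost/sqoop_datas \
  --username sqoop --password sqoop \
  --query 'SELECT id, name FROM orders WHERE $CONDITIONS' \
  --split-by id \
  --num-mappers 4 \
  --target-dir /user/hadoop/orders \
  --compress
```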
3. Incremental import arguments:
| Argument | Description |
|---|---|
| --check-column (col) | Specifies the column to be examined when determining which rows to import. |
| --incremental (mode) | Specifies how Sqoop determines which rows are new. Legal values for mode include append and lastmodified. |
| --last-value (value) | Specifies the maximum value of the check column from the previous import. |
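A sketch of an incremental append import, assuming a monotonically increasing id column; at the end of each run Sqoop prints the new maximum value of the check column to use as --last-value in the next run (a saved `sqoop job` stores it automatically):

```bash
# Only rows with id > 1000 are imported; append them to the existing dataset.
sqoop import \
  --connect jdbc:mysql://localhost/sqoop_datas \
  --username sqoop --password sqoop \
  --table orders \
  --incremental append \
  --check-column id \
  --last-value 1000 \
  --target-dir /user/hadoop/orders
```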
4. Output line formatting arguments:
| Argument | Description |
|---|---|
| --enclosed-by <char> | Sets a required field enclosing character |
| --escaped-by <char> | Sets the escape character |
| --fields-terminated-by <char> | Sets the field separator character |
| --lines-terminated-by <char> | Sets the end-of-line character |
| --mysql-delimiters | Uses MySQL’s default delimiter set: fields: `,` lines: `\n` escaped-by: `\` optionally-enclosed-by: `'` |
| --optionally-enclosed-by <char> | Sets a field enclosing character |
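A sketch combining a few formatting arguments to produce tab-separated, newline-terminated text files; the delimiters chosen here are arbitrary:

```bash
# Write fields separated by tabs, records ended by newlines, and wrap
# fields in double quotes only when they contain a delimiter.
sqoop import \
  --connect jdbc:mysql://localhost/sqoop_datas \
  --username sqoop --password sqoop \
  --table orders \
  --fields-terminated-by '\t' \
  --lines-terminated-by '\n' \
  --optionally-enclosed-by '"' \
  --target-dir /user/hadoop/orders
```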
5. Hive arguments:
| Argument | Description |
|---|---|
| --hive-home <dir> | Override $HIVE_HOME |
| --hive-import | Import tables into Hive (Uses Hive’s default delimiters if none are set.) |
| --hive-overwrite | Overwrite existing data in the Hive table. |
| --create-hive-table | If set, the job will fail if the target Hive table exists. By default this property is false. |
| --hive-table <table-name> | Sets the table name to use when importing to Hive. |
| --hive-drop-import-delims | Drops `\n`, `\r`, and `\01` from string fields when importing to Hive |