1.Sqoop中常用的命令
Sqoop中常用的命令有下面这些:

这些命令的用法请参考下面的实战案例。
2.Sqoop实战案例
环境准备:
(1)启动Hadoop集群
[root@bigdata ~]# start-all.sh
[root@bigdata ~]# jps
2656 NodeManager
3001 Jps
2378 SecondaryNameNode
2172 DataNode
2062 NameNode
2542 ResourceManager
(2)MySQL中的tblEmp表的数据:
mysql> select* from tblEmp;
+-------+--------+-----------+------+------------+------+------+--------+
| empno | ename | job | mgr | hiredate | sal | comm | deptno |
+-------+--------+-----------+------+------------+------+------+--------+
| 7369 | SMITH | CLERK | 7902 | 1980/12/17 | 800 | 0 | 20 |
| 7499 | ALLEN | SALESMAN | 7698 | 1981/2/20 | 1600 | 300 | 30 |
| 7521 | WARD | SALESMAN | 7698 | 1981/2/22 | 1250 | 500 | 30 |
| 7566 | JONES | MANAGER | 7839 | 1981/4/2 | 2975 | 0 | 20 |
| 7654 | MARTIN | SALESMAN | 7698 | 1981/9/28 | 1250 | 1400 | 30 |
| 7698 | BLAKE | MANAGER | 7839 | 1981/5/1 | 2850 | 0 | 30 |
| 7782 | CLARK | MANAGER | 7839 | 1981/6/9 | 2450 | 0 | 10 |
| 7788 | SCOTT | ANALYST | 7566 | 1987/4/19 | 3000 | 0 | 20 |
| 7839 | KING | PRESIDENT | 0 | 1981/11/17 | 5000 | 0 | 10 |
| 7844 | TURNER | SALESMAN | 7698 | 1981/9/8 | 1500 | 0 | 30 |
| 7876 | ADAMS | CLERK | 7788 | 1987/5/23 | 1100 | 0 | 20 |
| 7900 | JAMES | CLERK | 7698 | 1981/12/3 | 950 | 0 | 30 |
| 7902 | FORD | ANALYST | 7566 | 1981/12/3 | 3000 | 0 | 20 |
| 7934 | MILLER | CLERK | 7782 | 1982/1/23 | 1300 | 0 | 10 |
+-------+--------+-----------+------+------------+------+------+--------+
14 rows in set (0.00 sec)
以下案例均已MySQL数据库为操作对象,需要将MySQL的JDBC驱动包放到$SQOOP_HOME/lib目录下面。
例1:生成MySQL中表tblEmp的Java类代码:
[root@bigdata ~]# sqoop codegen \
--connect jdbc:mysql://bigdata:3306/mysql \
--username root \
--password 123456 \
--table tblEmp \
--bindir /sqoop/tblEmp
18/09/29 23:09:29 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
18/09/29 23:09:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/09/29 23:09:29 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/09/29 23:09:29 INFO tool.CodeGenTool: Beginning code generation
18/09/29 23:09:29 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM tblEmp AS t LIMIT 1
18/09/29 23:09:29 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM tblEmp AS t LIMIT 1
18/09/29 23:09:29 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /root/trainings/hadoop-2.7.3
注: /sqoop/tblEmp/tblEmp.java使用或覆盖了已过时的 API。
注: 有关详细信息, 请使用 -Xlint:deprecation 重新编译。
18/09/29 23:09:31 INFO orm.CompilationManager: Writing jar file: /sqoop/tblEmp/tblEmp.jar
[root@bigdata ~]# ls /sqoop/tblEmp
tblEmp$1.class tblEmp$3.class tblEmp$5.class tblEmp$7.class tblEmp.class tblEmp.jar
tblEmp$2.class tblEmp$4.class tblEmp$6.class tblEmp$8.class tblEmp$FieldSetterCommand.class tblEmp.java
例2:生成与MySQL中表tblEmp结构相对应的Hive表empHive(empHive不需要事先创建,注意和HBase的区别):
[root@bigdata ~]# sqoop create-hive-table \
--connect jdbc:mysql://bigdata:3306/mysql \
--username root \
--password 123456 \
--table tblEmp \
--hive-table empHive
例3:查询MySQL中表tblEmp中10号部门的员工姓名:
[root@bigdata ~]# sqoop eval \
--connect jdbc:mysql://bigdata:3306/mysql \
--username root \
--password 123456 \
--query "select ename from tblEmp where deptno=10"
18/09/29 23:31:34 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
18/09/29 23:31:34 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/09/29 23:31:34 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/trainings/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/trainings/apache-hive-3.1.0-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
------------------------
| ename |
------------------------
| CLARK |
| KING |
| MILLER |
例4:将HDFS上的/Students目录下的数据导出到MySQL中表tblStudents中:
[root@bigdata ~]# sqoop export \
--connect jdbc:mysql://bigdata:3306/mysql \
--username root \
--password 123456 \
--table tblStudents \
--export-dir /Students
例5:将MySQL中表tblEmp导入到HDFS上/Emp目录下:
[root@bigdata ~]# sqoop import \
--connect jdbc:mysql://bigdata:3306/mysql \
--username root \
--password 123456 \
--table tblEmp \
--target-dir /Emp \
-m 1
[root@bigdata ~]# hdfs dfs -ls /Emp
Found 2 items
-rw-r--r-- 1 root supergroup 0 2018-09-29 23:34 /Emp/_SUCCESS
-rw-r--r-- 1 root supergroup 614 2018-09-29 23:34 /Emp/part-m-00000
例6:将MySQL中所有表的数据导入到HDFS上:
mysql> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -ADatabase changed
mysql> show tables;
+---------------------------+
| Tables_in_mysql |
+---------------------------+
| columns_priv |
| db |
| event |
| func |
| general_log |
| help_category |
| help_keyword |
| help_relation |
| help_topic |
| innodb_index_stats |
| innodb_table_stats |
| ndb_binlog_index |
| plugin |
| proc |
| procs_priv |
| proxies_priv |
| servers |
| slave_master_info |
| slave_relay_log_info |
| slave_worker_info |
| slow_log |
| tables_priv |
| tblEmp |
| time_zone |
| time_zone_leap_second |
| time_zone_name |
| time_zone_transition |
| time_zone_transition_type |
| user |
+---------------------------+
29 rows in set (0.00 sec)
[root@bigdata ~]# sqoop import-all-tables \
--connect jdbc:mysql://bigdata:3306/mysql \
--username root \
--password 123456 \
-m 1
[root@bigdata ~]# hdfs dfs -ls /user/root
Found 29 items
drwxr-xr-x - root supergroup 0 2018-09-29 23:36 /user/root/columns_priv
drwxr-xr-x - root supergroup 0 2018-09-29 23:37 /user/root/db
drwxr-xr-x - root supergroup 0 2018-09-29 23:37 /user/root/event
drwxr-xr-x - root supergroup 0 2018-09-29 23:37 /user/root/func
drwxr-xr-x - root supergroup 0 2018-09-29 23:38 /user/root/general_log
drwxr-xr-x - root supergroup 0 2018-09-29 23:38 /user/root/help_category
drwxr-xr-x - root supergroup 0 2018-09-29 23:38 /user/root/help_keyword
drwxr-xr-x - root supergroup 0 2018-09-29 23:38 /user/root/help_relation
drwxr-xr-x - root supergroup 0 2018-09-29 23:39 /user/root/help_topic
drwxr-xr-x - root supergroup 0 2018-09-29 23:39 /user/root/innodb_index_stats
drwxr-xr-x - root supergroup 0 2018-09-29 23:39 /user/root/innodb_table_stats
drwxr-xr-x - root supergroup 0 2018-09-29 23:40 /user/root/ndb_binlog_index
drwxr-xr-x - root supergroup 0 2018-09-29 23:40 /user/root/plugin
drwxr-xr-x - root supergroup 0 2018-09-29 23:40 /user/root/proc
drwxr-xr-x - root supergroup 0 2018-09-29 23:40 /user/root/procs_priv
drwxr-xr-x - root supergroup 0 2018-09-29 23:41 /user/root/proxies_priv
drwxr-xr-x - root supergroup 0 2018-09-29 23:41 /user/root/servers
drwxr-xr-x - root supergroup 0 2018-09-29 23:41 /user/root/slave_master_info
drwxr-xr-x - root supergroup 0 2018-09-29 23:41 /user/root/slave_relay_log_info
drwxr-xr-x - root supergroup 0 2018-09-29 23:42 /user/root/slave_worker_info
drwxr-xr-x - root supergroup 0 2018-09-29 23:42 /user/root/slow_log
drwxr-xr-x - root supergroup 0 2018-09-29 23:42 /user/root/tables_priv
drwxr-xr-x - root supergroup 0 2018-09-29 23:42 /user/root/tblEmp
drwxr-xr-x - root supergroup 0 2018-09-29 23:43 /user/root/time_zone
drwxr-xr-x - root supergroup 0 2018-09-29 23:43 /user/root/time_zone_leap_second
drwxr-xr-x - root supergroup 0 2018-09-29 23:43 /user/root/time_zone_name
drwxr-xr-x - root supergroup 0 2018-09-29 23:43 /user/root/time_zone_transition
drwxr-xr-x - root supergroup 0 2018-09-29 23:44 /user/root/time_zone_transition_type
drwxr-xr-x - root supergroup 0 2018-09-29 23:44 /user/root/user
例7:列出MySQL中root用户的所有数据库:
[root@bigdata ~]# sqoop list-databases \
--connect jdbc:mysql://bigdata:3306 \
--username root \
--password 123456
information_schema
hive
mysql
performance_schema
spark
例8:列出MySQL中root用户下所有表:
[root@bigdata ~]# sqoop list-tables \
--connect jdbc:mysql://bigdata:3306/mysql \
--username root \
--password 123456
columns_priv
db
event
func
general_log
help_category
help_keyword
help_relation
help_topic
innodb_index_stats
innodb_table_stats
ndb_binlog_index
plugin
proc
procs_priv
proxies_priv
servers
slave_master_info
slave_relay_log_info
slave_worker_info
slow_log
tables_priv
tblEmp
time_zone
time_zone_leap_second
time_zone_name
time_zone_transition
time_zone_transition_type
user
例9:将MySQL中表tblEmp的数据导入到HBase表empHBase(empHBase需事先创建):
hbase(main):013:0> create 'empHBase','empinfo'
Created table empHBase
Took 0.8552 seconds
=> Hbase::Table - empHBase
hbase(main):014:0> scan 'empHBase'
ROW COLUMN+CELL
0 row(s)
Took 0.0227 seconds
[root@bigdata ~]# sqoop import \
--connect jdbc:mysql://bigdata:3306/mysql \
--username root \
--password 123456 \
--table tblEmp \
--columns empno,ename,sal,deptno \
--hbase-table empHBase \
--hbase-row-key empno \
--column-family empinfo \
-m 1
例10:查看Sqoop版本
[root@bigdata ~]# sqoop version
Warning: /root/trainings/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/trainings/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/09/29 23:54:59 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Sqoop 1.4.7
git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
Compiled by maugli on Thu Dec 21 15:59:58 STD 2017
例11:查看Sqoop帮助信息
[root@bigdata ~]# sqoop help
Warning: /root/trainings/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/trainings/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/09/29 23:55:20 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
usage: sqoop COMMAND [ARGS]Available commands:
codegen Generate code to interact with database records
create-hive-table Import a table definition into Hive
eval Evaluate a SQL statement and display the results
export Export an HDFS directory to a database table
help List available commands
import Import a table from a database to HDFS
import-all-tables Import tables from a database to HDFS
import-mainframe Import datasets from a mainframe server to HDFS
job Work with saved jobs
list-databases List available databases on a server
list-tables List available tables in a database
merge Merge results of incremental imports
metastore Run a standalone Sqoop metastore
version Display version informationSee 'sqoop help COMMAND' for information on a specific command.</pre>
例11:查看Sqoop命令的帮助信息
[root@bigdata ~]# sqoop command --help