USES sqoop-import to import data from mysql into HDFS because mysql table has no primary key
[walker001@walker001 ~]$ sqoop-import \
> --connect 'jdbc:mysql://192.168.220.129:3306/test?characterEncoding=UTF-8' \
> --username root \
> --password zwk95914 \
> --table user_info \
> --columns userId,userName,password,trueName,addedTime \
> --target-dir /sqoop/mysql
Warning: /home/walker001/app/sqoop-1.4.7.bin__hadoop-2.6.0/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/walker001/app/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/walker001/app/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/walker001/app/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
20/04/05 11:16:29 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
20/04/05 11:16:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
20/04/05 11:16:29 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
20/04/05 11:16:29 INFO tool.CodeGenTool: Beginning code generation
Sun Apr 05 11:16:30 CST 2020 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
20/04/05 11:16:31 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `user_info` AS t LIMIT 1
20/04/05 11:16:31 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `user_info` AS t LIMIT 1
20/04/05 11:16:31 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/walker001/app/hadoop-2.8.2
注: /tmp/sqoop-walker001/compile/e8a05a19dfbef5b687215bb6f631fbd2/user_info.java使用或覆盖了已过时的 API。
注: 有关详细信息, 请使用 -Xlint:deprecation 重新编译。
20/04/05 11:16:40 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-walker001/compile/e8a05a19dfbef5b687215bb6f631fbd2/user_info.jar
20/04/05 11:16:40 WARN manager.MySQLManager: It looks like you are importing from mysql.
20/04/05 11:16:40 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
20/04/05 11:16:40 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
20/04/05 11:16:40 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
20/04/05 11:16:40 ERROR tool.ImportTool: Import failed: No primary key could be found for table user_info. Please specify one with --split-by or perform a sequential import with '-m 1'.
<人力资源>
mysql> alter table user_info add primary key(userId);
Query OK, 0 rows affected (0.04 sec)
Records: 0 Duplicates: 0 Warnings: 0
add primary key
[walker001@walker001 ~]$ sqoop-import --connect 'jdbc:mysql://192.168.220.129:3306/test?characterEncoding=UTF-8' --username root --password zwk95914 --table user_info --columns userId,userName,password,trueName,addedTime --target-dir /sqoop/mysql
Warning: /home/walker001/app/sqoop-1.4.7.bin__hadoop-2.6.0/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/walker001/app/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/walker001/app/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/walker001/app/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
20/04/05 11:19:46 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
20/04/05 11:19:47 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
20/04/05 11:19:47 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
20/04/05 11:19:47 INFO tool.CodeGenTool: Beginning code generation
Sun Apr 05 11:19:47 CST 2020 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
20/04/05 11:19:48 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `user_info` AS t LIMIT 1
20/04/05 11:19:48 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `user_info` AS t LIMIT 1
20/04/05 11:19:48 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/walker001/app/hadoop-2.8.2
注: /tmp/sqoop-walker001/compile/72faaf2287c7c39a8586ce10f0e78d74/user_info.java使用或覆盖了已过时的 API。
注: 有关详细信息, 请使用 -Xlint:deprecation 重新编译。
20/04/05 11:19:52 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-walker001/compile/72faaf2287c7c39a8586ce10f0e78d74/user_info.jar
20/04/05 11:19:52 WARN manager.MySQLManager: It looks like you are importing from mysql.
20/04/05 11:19:52 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
20/04/05 11:19:52 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
20/04/05 11:19:52 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
20/04/05 11:19:52 INFO mapreduce.ImportJobBase: Beginning import of user_info
20/04/05 11:19:53 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
20/04/05 11:19:54 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
20/04/05 11:19:54 INFO client.RMProxy: Connecting to ResourceManager at walker001/192.168.220.129:8032
Sun Apr 05 11:20:06 CST 2020 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
20/04/05 11:20:07 INFO db.DBInputFormat: Using read commited transaction isolation
20/04/05 11:20:07 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`userId`), MAX(`userId`) FROM `user_info`
20/04/05 11:20:07 INFO db.IntegerSplitter: Split size: 0; Num splits: 4 from: 1 to: 2
20/04/05 11:20:07 INFO mapreduce.JobSubmitter: number of splits:2
20/04/05 11:20:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1586054209222_0001
20/04/05 11:20:09 INFO impl.YarnClientImpl: Submitted application application_1586054209222_0001
20/04/05 11:20:09 INFO mapreduce.Job: The url to track the job: http://walker001:8088/proxy/application_1586054209222_0001/
20/04/05 11:20:09 INFO mapreduce.Job: Running job: job_1586054209222_0001
20/04/05 11:20:33 INFO mapreduce.Job: Job job_1586054209222_0001 running in uber mode : false
20/04/05 11:20:33 INFO mapreduce.Job: map 0% reduce 0%
20/04/05 11:20:56 INFO mapreduce.Job: map 100% reduce 0%
20/04/05 11:20:58 INFO mapreduce.Job: Job job_1586054209222_0001 completed successfully
20/04/05 11:20:59 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=318188
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=213
HDFS: Number of bytes written=68
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Job Counters
Launched map tasks=2
Other local map tasks=2
Total time spent by all maps in occupied slots (ms)=39020
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=39020
Total vcore-milliseconds taken by all map tasks=39020
Total megabyte-milliseconds taken by all map tasks=39956480
Map-Reduce Framework
Map input records=2
Map output records=2
Input split bytes=213
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=840
CPU time spent (ms)=2990
Physical memory (bytes) snapshot=204382208
Virtual memory (bytes) snapshot=3781640192
Total committed heap usage (bytes)=48259072
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=68
20/04/05 11:20:59 INFO mapreduce.ImportJobBase: Transferred 68 bytes in 65.254 seconds (1.0421 bytes/sec)
20/04/05 11:20:59 INFO mapreduce.ImportJobBase: Retrieved 2 records.
[walker001@walker001 ~]$ hadoop fs -ls /sqoop/mysql
Found 3 items
-rw-r--r-- 2 walker001 supergroup 0 2020-04-05 11:20 /sqoop/mysql/_SUCCESS
-rw-r--r-- 2 walker001 supergroup 35 2020-04-05 11:20 /sqoop/mysql/part-m-00000
-rw-r--r-- 2 walker001 supergroup 33 2020-04-05 11:20 /sqoop/mysql/part-m-00001
[walker001@walker001 ~]$ hadoop fs -cat /sqoop/mysql/*
1,hello,123456,zhangsan,2017-09-01
2,hello2,123456,lisis,2019-09-01
after re-import