安装Sqoop
# 下载上传解压重命名
# 修改配置文件sqoop-env.sh
进行Sqoop安装目录中的conf目录
`[root@cloud-1 ~]# cd /opt/sqoop/conf/`
拷贝一份配置文件
`[root@cloud-1 conf]# cp sqoop-env-template.sh sqoop-env.sh`
打开并编辑sqoop-env.sh
`[root@cloud-1 conf]# vi sqoop-env.sh`
修改sqoop-env.sh的如下信息
```shell
export HADOOP_COMMON_HOME=/opt/hadoop
export HADOOP_MAPRED_HOME=/opt/hadoop
export HBASE_HOME=/opt/hbase
export HIVE_HOME=/opt/hive
```
注意:#export ZOOCFGDIR= #若你配置了ZooKeeper,也需要在此配置ZooKeeper的路径
3. 配置环境变量
打开当前用户的环境变量配置文件:
`[root@cloud-1 ~]# vi /etc/profile`
在配置文件末尾添加如下信息:
```shell
#SQOOP
export SQOOP_HOME=/opt/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
export CLASSPATH=$CLASSPATH:$SQOOP_HOME/lib
```
保存该文件,退出vi编辑器。
然后,执行下面命令让配置文件立即生效:
`[root@cloud-1 ~]# source /etc/profile`
4. 将mysql驱动包拷贝到$SQOOP_HOME/lib
下面要把MySQL驱动程序拷贝到$SQOOP_HOME/lib目录下,首先使用xftp等工具将下载好的mysql驱动程序文件上传到root用户的主目录“/root”下面。
下面执行命令拷贝文件:
```shell
[root@cloud-1 ~]# cp mysql-connector-java.jar /opt/sqoop/lib/
# 将hive/lib中的hive-common.jar拷贝到sqoop的lib目录中
[root@cloud-1 ~]# cp /opt/hive/lib/hive-common-1.2.1.jar /opt/sqoop/lib/
```
5. 测试与MySQL的连接
首先请确保mysql服务已经启动了,如果没有启动,请执行下面命令启动:
`systemctl start mysql`
然后就可以测试sqoop与MySQL之间的连接是否成功:
```shell
sqoop list-databases --connect jdbc:mysql://127.0.0.1:3306/ --username root -P
```
mysql的数据库列表显示在屏幕上表示连接成功
# 错误
## 错误1
```shell
2022-07-25 11:05:17,633 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/lang/StringUtils
at org.apache.sqoop.manager.MySQLManager.initOptionDefaults(MySQLManager.java:73)
at org.apache.sqoop.manager.SqlManager.<init>(SqlManager.java:89)
at com.cloudera.sqoop.manager.SqlManager.<init>(SqlManager.java:33)
at org.apache.sqoop.manager.GenericJdbcManager.<init>(GenericJdbcManager.java:51)
at com.cloudera.sqoop.manager.GenericJdbcManager.<init>(GenericJdbcManager.java:30)
at org.apache.sqoop.manager.CatalogQueryManager.<init>(CatalogQueryManager.java:46)
at com.cloudera.sqoop.manager.CatalogQueryManager.<init>(CatalogQueryManager.java:31)
at org.apache.sqoop.manager.InformationSchemaManager.<init>(InformationSchemaManager.java:38)
at com.cloudera.sqoop.manager.InformationSchemaManager.<init>(InformationSchemaManager.java:31)
at org.apache.sqoop.manager.MySQLManager.<init>(MySQLManager.java:65)
at org.apache.sqoop.manager.DefaultManagerFactory.accept(DefaultManagerFactory.java:67)
at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:184)
at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:272)
at org.apache.sqoop.tool.ListDatabasesTool.run(ListDatabasesTool.java:44)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.lang.StringUtils
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 20 more
```
前提:
hadoop 已经正常启动,并且mysql驱动包已经导入好
原因:
原因在异常中已经很明显了:没有找到类org.apache.commons.lang.StringUtils 也就是说Sqoop中没有这个类或者包含这个类的jar包。
解决办法:
将这个类所在的jar包添加到Sqoop中,这个类所在的jar包是 commons-lang
第一步 下载commons-lang: 下载地址: [点击下载](http://mirror.bit.edu.cn/apache//commons/lang/binaries/) 注意:这里面有两个版本 2 和 3 请优先使用 2 因为Sqoop版本大部分是支持2 的 (如果不行 再尝试3)
第二步 把下载下来的文件解压 :里面会得到好几个jar文件 我们需要的是 commons-lang-*.*.jar
第三步 把该jar包放到Sqoop 下的lib文件夹下面即可:
![image.png](https://cos.easydoc.net/52087651/files/l606ivnv.png)
第四步:输入这一串`sqoop list-databases --connect jdbc:mysql://127.0.0.1:3306/ --username root -P`试试
若能查看到数据库,表示成功
## 错误2
```shell
22/07/26 14:05:35 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:403)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:990)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:335)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2187)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2220)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2015)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:768)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:403)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:385)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:323)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:904)
at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:59)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:763)
at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786)
at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:289)
at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:260)
at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:246)
at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:327)
at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1872)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1671)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:501)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.net.UnknownHostException: Hadoop01: 未知的名称或服务
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
at java.net.InetAddress.getAllByName(InetAddress.java:1193)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:188)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:299)
... 33 more
```
为了使 Sqoop 能够连接 Hive,需要将 hive 组件/usr/local/apache-hive-1.2.2-bin/lib/目录下的hive-common-1.2.2.jar也放入 Sqoop 安装路径的 lib 目录中。
[hadoop@master ~] cp /usr/local/src/hive/lib/hive-common-2.0.0.jar
/usr/local/src/sqoop/lib/