安装Sqoop

# 下载上传解压重命名 # 修改配置文件sqoop-env.sh 进行Sqoop安装目录中的conf目录 `[root@cloud-1 ~]# cd /opt/sqoop/conf/` 拷贝一份配置文件 `[root@cloud-1 conf]# cp sqoop-env-template.sh sqoop-env.sh` 打开并编辑sqoop-env.sh `[root@cloud-1 conf]# vi sqoop-env.sh` 修改sqoop-env.sh的如下信息 ```shell export HADOOP_COMMON_HOME=/opt/hadoop export HADOOP_MAPRED_HOME=/opt/hadoop export HBASE_HOME=/opt/hbase export HIVE_HOME=/opt/hive ``` 注意:#export ZOOCFGDIR= #若你配置了ZooKeeper,也需要在此配置ZooKeeper的路径 3. 配置环境变量 打开当前用户的环境变量配置文件: `[root@cloud-1 ~]# vi /etc/profile` 在配置文件末尾添加如下信息: ```shell #SQOOP export SQOOP_HOME=/opt/sqoop export PATH=$PATH:$SQOOP_HOME/bin export CLASSPATH=$CLASSPATH:$SQOOP_HOME/lib ``` 保存该文件,退出vi编辑器。 然后,执行下面命令让配置文件立即生效: `[root@cloud-1 ~]# source /etc/profile` 4. 将mysql驱动包拷贝到$SQOOP_HOME/lib 下面要把MySQL驱动程序拷贝到$SQOOP_HOME/lib目录下,首先使用xftp等工具将下载好的mysql驱动程序文件上传到root用户的主目录“/root”下面。 下面执行命令拷贝文件: ```shell [root@cloud-1 ~]# cp mysql-connector-java.jar /opt/sqoop/lib/ # 将hive/lib中的hive-common.jar拷贝到sqoop的lib目录中 [root@cloud-1 ~]# cp /opt/hive/lib/hive-common-1.2.1.jar /opt/sqoop/lib/ ``` 5. 测试与MySQL的连接 首先请确保mysql服务已经启动了,如果没有启动,请执行下面命令启动: `systemctl start mysql` 然后就可以测试sqoop与MySQL之间的连接是否成功: ```shell sqoop list-databases --connect jdbc:mysql://127.0.0.1:3306/ --username root -P ``` mysql的数据库列表显示在屏幕上表示连接成功 # 错误 ## 错误1 ```shell 2022-07-25 11:05:17,633 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/lang/StringUtils at org.apache.sqoop.manager.MySQLManager.initOptionDefaults(MySQLManager.java:73) at org.apache.sqoop.manager.SqlManager.<init>(SqlManager.java:89) at com.cloudera.sqoop.manager.SqlManager.<init>(SqlManager.java:33) at org.apache.sqoop.manager.GenericJdbcManager.<init>(GenericJdbcManager.java:51) at com.cloudera.sqoop.manager.GenericJdbcManager.<init>(GenericJdbcManager.java:30) at org.apache.sqoop.manager.CatalogQueryManager.<init>(CatalogQueryManager.java:46) at com.cloudera.sqoop.manager.CatalogQueryManager.<init>(CatalogQueryManager.java:31) at org.apache.sqoop.manager.InformationSchemaManager.<init>(InformationSchemaManager.java:38) at com.cloudera.sqoop.manager.InformationSchemaManager.<init>(InformationSchemaManager.java:31) at org.apache.sqoop.manager.MySQLManager.<init>(MySQLManager.java:65) at org.apache.sqoop.manager.DefaultManagerFactory.accept(DefaultManagerFactory.java:67) at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:184) at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:272) at org.apache.sqoop.tool.ListDatabasesTool.run(ListDatabasesTool.java:44) at org.apache.sqoop.Sqoop.run(Sqoop.java:147) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243) at org.apache.sqoop.Sqoop.main(Sqoop.java:252) Caused by: java.lang.ClassNotFoundException: org.apache.commons.lang.StringUtils at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ... 20 more ``` 前提: hadoop 已经正常启动,并且mysql驱动包已经导入好 原因: 原因在异常中已经很明显了:没有找到类org.apache.commons.lang.StringUtils 也就是说Sqoop中没有这个类或者包含这个类的jar包。 解决办法: 将这个类所在的jar包添加到Sqoop中,这个类所在的jar包是 commons-lang 第一步 下载commons-lang: 下载地址: [点击下载](http://mirror.bit.edu.cn/apache//commons/lang/binaries/) 注意:这里面有两个版本 2 和 3 请优先使用 2 因为Sqoop版本大部分是支持2 的 (如果不行 再尝试3) 第二步 把下载下来的文件解压 :里面会得到好几个jar文件 我们需要的是 commons-lang-*.*.jar 第三步 把该jar包放到Sqoop 下的lib文件夹下面即可: ![image.png](https://cos.easydoc.net/52087651/files/l606ivnv.png) 第四步:输入这一串`sqoop list-databases --connect jdbc:mysql://127.0.0.1:3306/ --username root -P`试试 若能查看到数据库,表示成功 ## 错误2 ```shell 22/07/26 14:05:35 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server. com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server. at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at com.mysql.jdbc.Util.handleNewInstance(Util.java:403) at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:990) at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:335) at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2187) at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2220) at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2015) at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:768) at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at com.mysql.jdbc.Util.handleNewInstance(Util.java:403) at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:385) at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:323) at java.sql.DriverManager.getConnection(DriverManager.java:664) at java.sql.DriverManager.getConnection(DriverManager.java:247) at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:904) at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:59) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:763) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786) at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:289) at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:260) at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:246) at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:327) at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1872) at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1671) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106) at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:501) at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628) at org.apache.sqoop.Sqoop.run(Sqoop.java:147) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243) at org.apache.sqoop.Sqoop.main(Sqoop.java:252) Caused by: java.net.UnknownHostException: Hadoop01: 未知的名称或服务 at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) at java.net.InetAddress.getAllByName0(InetAddress.java:1277) at java.net.InetAddress.getAllByName(InetAddress.java:1193) at java.net.InetAddress.getAllByName(InetAddress.java:1127) at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:188) at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:299) ... 33 more ``` 为了使 Sqoop 能够连接 Hive,需要将 hive 组件/usr/local/apache-hive-1.2.2-bin/lib/目录下的hive-common-1.2.2.jar也放入 Sqoop 安装路径的 lib 目录中。 [hadoop@master ~] cp /usr/local/src/hive/lib/hive-common-2.0.0.jar /usr/local/src/sqoop/lib/