Unable to export data from Hive to MySQL using Sqoop

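I am using Sqoop to export processed data, stored in HDFS in Hive format, to a MySQL server. The setup is straightforward, but no matter what I do, Sqoop does not recognize the field delimiter correctly. What could be the problem?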

Here is my table definition in Hive:

hive> show create table database.weblog_ag;

OK
CREATE  TABLE database.weblog_ag(
  visitor_id string,
  time array<string>,
  url array<string>,
  client_time array<string>,
  resolution array<string>,
  browser array<string>,
  os array<string>,
  devicetype array<string>,
  devicemodel array<string>,
  ipinfo array<string>)
CLUSTERED BY (
  visitor_id)
SORTED BY (
  time ASC)
INTO 32 BUCKETS
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://poc/apps/hive/warehouse/database.db/weblog_ag'
TBLPROPERTIES (
  'numPartitions'='0',
  'numFiles'='96',
  'transient_lastDdlTime'='1390411893',
  'totalSize'='59633487',
  'numRows'='0',
  'rawDataSize'='0')
Time taken: 1.871 seconds, Fetched: 31 row(s)

When I inspect the files in HDFS, the fields are correctly separated by the \t (tab) character. Here is a sample row from HDFS:

101009a36b3113fa        2014-01-06 08:59:58     http://someurl    2014-01-06 08:56:53     1280x800        Chrome  Windows XP      General_Desktop Other   115.74.215.116
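
One way to confirm the delimiter is to count the raw tab bytes in a row. The sketch below is a minimal check, assuming a local copy of the sample row above (the variable names `sample_row` and `tab_count` are illustrative); the commented `hdfs dfs -cat` line shows how the same inspection might be run against the real files.

```shell
# Hypothetical local copy of the sample row; tabs are written explicitly with printf.
sample_row=$(printf '101009a36b3113fa\t2014-01-06 08:59:58\thttp://someurl\t2014-01-06 08:56:53\t1280x800\tChrome\tWindows XP\tGeneral_Desktop\tOther\t115.74.215.116')

# Keep only the tab bytes and count them: 10 columns should be separated by 9 tabs.
tab_count=$(printf '%s' "$sample_row" | tr -cd '\t' | wc -c | tr -d ' ')
echo "tabs: $tab_count"

# Against the real data, the first row could be dumped byte by byte instead:
# hdfs dfs -cat /apps/hive/warehouse/database.db/weblog_ag/* | head -n 1 | od -c
```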

Here is my Sqoop options file:

export

--connect
jdbc:mysql://webserver/fprofile_db

--username
username

--password
password

--table
weblog

--direct

--export-dir
/apps/hive/warehouse/database.db/weblog_ag

--input-fields-terminated-by
'\011'

--columns
visitor_id, time, url, client_time, resolution, browser, os, devicetype, devicemodel, ipinfo

I tried both '\011' and '\t' as the --input-fields-terminated-by argument, but neither worked. The exported result in MySQL looks like this:

[Screenshot of the exported rows in MySQL; image not preserved]

What could be the problem?

3 Answers

0
Even though you are exporting, you actually need to use
```
--fields-terminated-by
'\t'
```
Answered 2024-05-11T19:08:47+08:00
0
I found that Sqoop's direct mode for MySQL ignored my --input-fields-terminated-by setting and always used 0x2c (comma).

When I use Sqoop's direct mode for MySQL, it generates a query like this:
```
LOAD DATA LOCAL INFILE '/yarn/nm/usercache/hdfs/appcache/application_12345/somefile.txt'
    INTO TABLE mytable 
FIELDS TERMINATED BY 0x2c 
LINES TERMINATED BY 0xa 
IGNORE 0 LINES (field1, field2, ...)
```
You can see that it specifies 0x2c in the FIELDS TERMINATED BY clause.
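
If the load really does split on commas, as observed here, a tab-separated row would be read as a single field. The sketch below illustrates this with the sample row from the question; it only mimics the splitting with `awk` and does not run Sqoop or MySQL, and the variable names are illustrative.

```shell
# Hypothetical local copy of the question's sample row (tab-separated).
sample_row=$(printf '101009a36b3113fa\t2014-01-06 08:59:58\thttp://someurl\t2014-01-06 08:56:53\t1280x800\tChrome\tWindows XP\tGeneral_Desktop\tOther\t115.74.215.116')

# Split on comma (0x2c), as in the generated LOAD DATA statement: the row has no commas.
comma_fields=$(printf '%s\n' "$sample_row" | awk -F',' '{print NF}')

# Split on tab, as the Hive table definition intends.
tab_fields=$(printf '%s\n' "$sample_row" | awk -F'\t' '{print NF}')

echo "fields when split on comma: $comma_fields"
echo "fields when split on tab:   $tab_fields"
```

With a comma delimiter the whole row would land in the first column, which would be consistent with the broken export described in the question.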
Answered 2024-05-11T19:08:47+08:00
3

So at the end of the day, the culprit was the --direct option. I removed it and everything worked.
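
For reference, the options file without the direct-mode flag might look like the sketch below. The file name `export-options.txt` is an assumption for illustration; the connection string, credentials, table, and paths are the placeholders from the question, and the column list is written without spaces as a precaution.

```shell
# Write a hypothetical options file: the question's settings with the direct-mode flag omitted.
cat > export-options.txt <<'EOF'
# Sqoop export options (JDBC path, no direct mode)
export
--connect
jdbc:mysql://webserver/fprofile_db
--username
username
--password
password
--table
weblog
--export-dir
/apps/hive/warehouse/database.db/weblog_ag
--input-fields-terminated-by
'\t'
--columns
visitor_id,time,url,client_time,resolution,browser,os,devicetype,devicemodel,ipinfo
EOF

# Run the export with the options file (requires a configured Sqoop client):
# sqoop --options-file export-options.txt
```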

Answered 2024-05-11T19:08:47+08:00
