Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 1 did not have 2 elements

examdata <- RCurl::getURL("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt")

examdata2 <- read.table(textConnection(examdata), sep = ",", header = T)

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 1 did not have 2 elements

2 Answers

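It looks like you just need to skip a few lines. I used readLines(textConnection(examdata)) to find where the actual data table begins; it starts on line 32. So we can use the skip argument of read.csv to skip the first 31 lines. I also set strip.white = TRUE because the table appears to contain some stray whitespace.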

(df <- read.csv(text = examdata, skip = 31L, strip.white = TRUE))
#                          Type Cash Check Credit Debit Electronic Other Total
# 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
# 2   Average Transaction Value  $21  $168    $56   $44       $216   $69   $59
# 3      Value of Payments in %   14    19     16    18         27     5   100
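
If you would rather not count lines by hand, the header row can be located programmatically. The following is only a minimal sketch: sample_text is a hypothetical stand-in for the downloaded file (not its real contents), and it assumes the header row is the first line starting with "Type,".

```r
# Hypothetical stand-in for the downloaded text: preamble lines, then the table.
sample_text <- paste(
  "Spending data",
  "Source: hypothetical preamble",
  "",
  "Type,Cash,Check",
  "Average Number of Purchases,23.7,3.9",
  "Average Transaction Value,$21,$168",
  sep = "\n"
)

# Split into physical lines and find the first line that looks like the header.
lines <- strsplit(sample_text, "\n", fixed = TRUE)[[1]]
header_line <- grep("^Type,", lines)[1]

# Skip everything before the header; read.csv then treats that line as column names.
df_sample <- read.csv(text = sample_text, skip = header_line - 1L, strip.white = TRUE)
df_sample
```

With the real file, the same idea would apply to examdata in place of sample_text, and header_line - 1L should come out as the 31 used above.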

Since you'll probably want those values to be numeric, you'll need to remove the $ signs and convert the columns to numeric so you can use them in any later calculations.

df[-1] <- lapply(df[-1], function(x) as.numeric(sub("[$]", "", x)))
df
#                          Type Cash Check Credit Debit Electronic Other Total
# 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
# 2   Average Transaction Value 21.0 168.0   56.0  44.0      216.0  69.0  59.0
# 3      Value of Payments in % 14.0  19.0   16.0  18.0       27.0   5.0 100.0

Now every column except the first is numeric.
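
If the values could also contain thousands separators, a slightly broader pattern may help. This is a sketch under assumptions: df_demo and to_number are hypothetical names, and the "$1,168" value is made up for illustration and does not come from the file.

```r
# Hypothetical data frame mimicking the shape of the table above.
df_demo <- data.frame(
  Type  = c("Average Number of Purchases", "Average Transaction Value"),
  Cash  = c("23.7", "$21"),
  Check = c("3.9", "$1,168"),   # made-up value with a thousands separator
  stringsAsFactors = FALSE
)

# Strip every "$" and "," before converting; gsub replaces all matches, not just the first.
to_number <- function(x) as.numeric(gsub("[$,]", "", x))

df_demo[-1] <- lapply(df_demo[-1], to_number)

# Confirm which columns are now numeric.
sapply(df_demo, is.numeric)
```

Checking with sapply(df, is.numeric) after the conversion is a quick way to confirm that only the Type column was left as text.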

Answered 2024-05-04T16:58:07+08:00

read.table and read.csv accept a URL as the path and handle the connection for you, so you don't need RCurl:

read.csv("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt", 
         skip = 31)

##                          Type Cash Check Credit Debit Electronic Other Total
## 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
## 2   Average Transaction Value  $21  $168    $56   $44       $216   $69   $59

Additionally, if you use readr::read_csv, you can tell it to parse the columns as numbers, which drops the $ characters as the file is read:

library(readr)

read_csv("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt", 
         skip = 31, 
         col_types = cols(Type = 'c', .default = 'n'))    # c = character, n = number

## # A tibble: 2 × 8
##                          Type  Cash Check Credit Debit Electronic Other Total
##                         <chr> <dbl> <dbl>  <dbl> <dbl>      <dbl> <dbl> <dbl>
## 1 Average Number of Purchases  23.7   3.9   10.1  14.4        4.4   2.3  58.7
## 2   Average Transaction Value  21.0 168.0   56.0  44.0      216.0  69.0  59.0
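
To see the number parsing in isolation, readr also exports parse_number(), which can be tried on a plain character vector. A small sketch, assuming readr is installed; the input strings are copied from the table above and prices is a hypothetical variable name.

```r
library(readr)

# parse_number() ignores non-numeric characters before and after the number.
prices <- parse_number(c("$21", "$168", "$216"))
prices
```

This makes it easy to check how a given value would be parsed before committing to a col_types specification.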

Answered 2024-05-04T16:58:07+08:00
