
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 1 did not have 2 elements

examdata <- RCurl::getURL("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt")

examdata2 <- read.table(textConnection(examdata), sep = ",", header = T)

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : line 1 did not have 2 elements

2 Answers

  • 7

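    It looks like you just need to skip a few lines. I used readLines(textConnection(examdata)) to find where the actual data table starts. It turns out to start on line 32, so we can use the skip argument of read.csv to skip the first 31 lines. I also used the strip.white argument because the table appears to contain some stray whitespace.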

    (df <- read.csv(text = examdata, skip = 31L, strip.white = TRUE))
    #                          Type Cash Check Credit Debit Electronic Other Total
    # 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
    # 2   Average Transaction Value  $21  $168    $56   $44       $216   $69   $59
    # 3      Value of Payments in %   14    19     16    18         27     5   100
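
    As a rough sketch of how the start of the table could be located without counting lines by hand, the header row can be searched for with grep. The sample text below is a hypothetical stand-in for the downloaded file, and the assumption that the header row begins with "Type," is taken from the output above.

    ```r
    # Hypothetical stand-in for the downloaded text: free-form notes, then a CSV table.
    sample_text <- paste(
      "Spending data notes",
      "Source: illustrative only",
      "",
      "Type, Cash, Check",
      "Average Number of Purchases, 23.7, 3.9",
      "Average Transaction Value, $21, $168",
      sep = "\n"
    )

    # Split the text into lines and find the first line that looks like the header.
    con <- textConnection(sample_text)
    lines <- readLines(con)
    close(con)
    header_line <- grep("^Type,", lines)[1]

    # Skip everything before the header row, then parse the rest as CSV.
    df_sample <- read.csv(text = sample_text, skip = header_line - 1L,
                          strip.white = TRUE)
    ```

    With the real file, the same idea would give skip = 31 if the header sits on line 32 as described above.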
    

    Since you'll probably want those numbers to be numeric, you'll need to remove the $ signs and convert the columns to numeric so that you can use them in any calculations you do later.

    df[-1] <- lapply(df[-1], function(x) as.numeric(sub("[$]", "", x)))
    df
    #                          Type Cash Check Credit Debit Electronic Other Total
    # 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
    # 2   Average Transaction Value 21.0 168.0   56.0  44.0      216.0  69.0  59.0
    # 3      Value of Payments in % 14.0  19.0   16.0  18.0       27.0   5.0 100.0
    

    Now all columns except the first are numeric.
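
    One way to double-check the conversion is to inspect the column classes afterwards. This is only a sketch on a small hypothetical data frame (df_demo is not the real data), using the same sub() call as above.

    ```r
    # Hypothetical two-row data frame mimicking the mixed "$" and plain values.
    df_demo <- data.frame(
      Type  = c("Average Number of Purchases", "Average Transaction Value"),
      Cash  = c("23.7", "$21"),
      Check = c("3.9", "$168"),
      stringsAsFactors = FALSE
    )

    # Strip a "$" from every column except the first, then convert to numeric.
    df_demo[-1] <- lapply(df_demo[-1], function(x) as.numeric(sub("[$]", "", x)))

    # TRUE for each column that is now numeric; the Type column stays character.
    numeric_cols <- sapply(df_demo, is.numeric)

    # as.numeric() returns NA for anything it cannot parse, so check for that too.
    has_na <- anyNA(df_demo[-1])
    ```

    If has_na were TRUE, some value contained something other than digits and a single $, and the pattern would need adjusting.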

  • 0

    read.table and read.csv will take a URL as the path and handle the connection for you, so you don't need RCurl:

    read.csv("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt", 
             skip = 31)
    
    ##                          Type Cash Check Credit Debit Electronic Other Total
    ## 1 Average Number of Purchases 23.7   3.9   10.1  14.4        4.4   2.3  58.7
    ## 2   Average Transaction Value  $21  $168    $56   $44       $216   $69   $59
    

    Also, if you use readr::read_csv, you can tell it to parse the columns as numbers, which drops the $ characters as they are read:

    library(readr)
    
    read_csv("https://raw.githubusercontent.com/jrwolf/IT497/master/spendingdata.txt", 
             skip = 31, 
             col_types = cols(Type = 'c', .default = 'n'))    # c = character, n = number
    
    ## # A tibble: 2 × 8
    ##                          Type  Cash Check Credit Debit Electronic Other Total
    ##                         <chr> <dbl> <dbl>  <dbl> <dbl>      <dbl> <dbl> <dbl>
    ## 1 Average Number of Purchases  23.7   3.9   10.1  14.4        4.4   2.3  58.7
    ## 2   Average Transaction Value  21.0 168.0   56.0  44.0      216.0  69.0  59.0
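
    To see what the number column type does to individual values, readr's parse_number can be called directly on a character vector. This is a small sketch that assumes the readr package is installed; the input vector is made up for illustration.

    ```r
    # Made-up values in the same style as the table: some with "$", some without.
    raw_values <- c("$21", "$168", "23.7")

    # parse_number() drops non-numeric characters around the number and returns doubles.
    parsed <- readr::parse_number(raw_values)
    ```

    Here parsed is a plain numeric vector, so it can be used in arithmetic straight away.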
    
