
Dynamically changing the data types of a data frame


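I have a set of data frames, one per country, each containing three variables (year, AI, OAD). The one for Zimbabwe is shown below as an example: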

>str(dframe_Zimbabwe_1955_1970)
'data.frame':   16 obs. of  3 variables:
 $ year: chr  "1955" "1956" "1957" "1958" ...
 $ AI  : chr  "11.61568161" "11.34114927" "11.23639317" "11.18841409" ...
 $ OAD : chr  "5.740789488" "5.775882473" "5.800441036" "5.822536579" ...

I am trying to change the data types of the variables in the data frame to the following, so that I can fit a linear model with lm(dframe_Zimbabwe_1955_1970$AI ~ dframe_Zimbabwe_1955_1970$year).

>str(dframe_Zimbabwe_1955_1970) 
'data.frame':   16 obs. of  3 variables:
 $ year: int  1955 1956 1957 1958 ...
 $ AI  : num  11.61568161 11.34114927 11.23639317 11.18841409 ...
 $ OAD : num  5.740789488 5.775882473 5.800441036 5.822536579 ...

The following static code successfully changes AI from character (chr) to numeric (num):

dframe_Zimbabwe_1955_1970$AI <- as.numeric(dframe_Zimbabwe_1955_1970$AI)

However, when I try to automate the code, AI is still character (chr):

countries <- c('Zimbabwe', 'Afghanistan', ...) 

for (country in countries) {
  assign(paste('dframe_',country,'_1955_1970$AI', sep=''), eval(parse(text = paste('as.numeric(dframe_',country,'_1955_1970$AI)', sep=''))))
}

Can you tell me what I might be doing wrong?

Thanks.

2 Answers

  • 1

    Purists will consider this rather ugly code, but perhaps something like this:

    for (country in countries) {
        new_val <- get(paste('dframe_', country, '_1955_1970', sep=''))
        new_val[] <- lapply(new_val, as.numeric)  # the '[]' on LHS keeps dataframe
        assign(paste('dframe_', country, '_1955_1970', sep=''), new_val)
    }
    

    Using the get('obj_name') function is considered cleaner than eval(parse(text=...)). It would be handled even more cleanly if you collected these data frames in a list.
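
    As a rough sketch of the list approach mentioned above: the names dframes and fits are made up for illustration, and the Afghanistan values are placeholders rather than real data. The sketch also converts year to integer, to match the target structure shown in the question.

    ```r
    # Sample data: Zimbabwe values come from the question;
    # Afghanistan values are placeholders for illustration only.
    dframes <- list(
      Zimbabwe = data.frame(
        year = c("1955", "1956", "1957"),
        AI   = c("11.61568161", "11.34114927", "11.23639317"),
        OAD  = c("5.740789488", "5.775882473", "5.800441036"),
        stringsAsFactors = FALSE
      ),
      Afghanistan = data.frame(
        year = c("1955", "1956", "1957"),
        AI   = c("10.1", "10.2", "10.4"),
        OAD  = c("5.1", "5.2", "5.3"),
        stringsAsFactors = FALSE
      )
    )

    # Convert every column of every data frame in one pass.
    dframes <- lapply(dframes, function(df) {
      df[] <- lapply(df, as.numeric)   # '[]' on the LHS keeps the data.frame class
      df$year <- as.integer(df$year)   # year as int, as in the target str() output
      df
    })

    str(dframes$Zimbabwe)

    # One linear fit per country, using the data argument instead of `$`.
    fits <- lapply(dframes, function(df) lm(AI ~ year, data = df))
    ```

    If the per-country objects already exist in the workspace, something like mget(paste0('dframe_', countries, '_1955_1970')) could probably be used to gather them into such a list first.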

  • 2

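    42: Your code does not run as written, but it does with a few edits. Apart from a missing parenthesis and a wrong sep, you cannot use $'column name' in the name passed to assign, but you do not need it anyway: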

    for (country in countries) {
      new_val <- get(paste( 'dframe_',country,'_1955_1970', sep=''))
      new_val[] <- lapply(new_val, as.numeric)  # the '[]' on LHS keeps dataframe
      assign(paste('dframe_',country,'_1955_1970', sep=''), new_val)
      remove(new_val)
    }
    

    Proof that it works:

    dframe_Zimbabwe_1955_1970 <- data.frame(year = c("1955", "1956", "1957"), 
                                             AI = c("11.61568161", "11.34114927", "11.23639317"),
                                             OAD = c("5.740789488", "5.775882473", "5.800441036"),
                                             stringsAsFactors = FALSE)
    str(dframe_Zimbabwe_1955_1970)
    'data.frame':   3 obs. of  3 variables:
     $ year: chr  "1955" "1956" "1957"
     $ AI  : chr  "11.61568161" "11.34114927" "11.23639317"
     $ OAD : chr  "5.740789488" "5.775882473" "5.800441036"
    
    countries <- 'Zimbabwe'
    for (country in countries) {
      new_val <- get(paste('dframe_', country, '_1955_1970', sep=''))
      new_val[] <- lapply(new_val, as.numeric)  # the '[]' on LHS keeps dataframe
      assign(paste('dframe_', country, '_1955_1970', sep=''), new_val)
      remove(new_val)
    }
    
    str(dframe_Zimbabwe_1955_1970)
    'data.frame':   3 obs. of  3 variables:
     $ year: num  1955 1956 1957
     $ AI  : num  11.6 11.3 11.2
     $ OAD : num  5.74 5.78 5.8
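
    To illustrate why putting $ in the name passed to assign leaves the column unchanged, here is a small sketch. The object name df_demo is made up for this example; the values are taken from the question.

    ```r
    df_demo <- data.frame(AI = c("11.61568161", "11.34114927"),
                          stringsAsFactors = FALSE)

    # assign() treats its first argument as a literal variable name, so this
    # creates a new object called `df_demo$AI` instead of modifying the column.
    assign("df_demo$AI", as.numeric(df_demo$AI))

    class(df_demo$AI)         # "character": the column is untouched
    exists("df_demo$AI")      # TRUE: a separate, oddly named object now exists
    class(get("df_demo$AI"))  # "numeric"

    # Working pattern: fetch the whole data frame, modify it, assign it back.
    tmp <- get("df_demo")
    tmp$AI <- as.numeric(tmp$AI)
    assign("df_demo", tmp)

    class(df_demo$AI)         # "numeric"
    ```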
    
