Undefined columns selected vs. duplicate 'row.names' are not allowed

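In a for loop, I am trying to run a function on two columns of my data frame, moving on to another dataset with each iteration of the loop. I want to collect the output of every iteration in a single answer vector.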

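I cannot get past the following errors (listed below my code). Which one I get depends on whether I add or remove row.names = NULL in the data <- read.csv... part of the code below (line 4 of the for loop):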

**Edited to include the directory reference, which is where the error ultimately was:**

corr <- function(directory, threshold = 0) {
  source("complete.R")

The code above, together with my directory organization (not shown), was where my error was.

lookup <- complete("specdata")
  setwd(paste0(getwd(),"/",directory,sep=""))
  files <-list.files(full.names="TRUE") #read file names
  len <- length(files)   
  answer2 <- vector("numeric") 
  answer <- vector("numeric")
  dataN <- data.frame()
      for (i in 1:len) {
          if (lookup[i,"nobs"] > threshold){
               # TRUE -> read that file, remove the NA data and add to the overall data frame
               data <- read.csv(file = files[i], header = TRUE, sep = ",")
               #remove incomplete
               dataN <- data[complete.cases(data),]
               #If yes, compute the correlation and assign its results to an intermediate vector.

        answer<-cor(dataN[,"sulfate"],dataN[,"nitrate"])
        answer2 <- c(answer2,answer)
      }
    }

  setwd("../")
  return(answer2)
}

1) Error in read.table(file = file, header = header, sep = sep, quote = quote, : duplicate 'row.names' are not allowed

vs.

2) Error in `[.data.frame`(data, 2:3) : undefined columns selected
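
Both messages can be reproduced with tiny made-up inputs, which may help show why toggling row.names = NULL swaps one error for the other. This is a minimal sketch, not the asker's data: the inline CSV text is hypothetical. read.table uses the first column as row names when the header has one field fewer than the data rows, and it fails if those values repeat. With row.names = NULL the rows are numbered instead and that column is kept as data.

```r
# Hypothetical malformed input: the header has 2 fields, the rows have 3.
bad_csv <- "a,b\n1,2,3\n1,4,5"

# Without row.names = NULL the first column becomes the row names,
# and the repeated value 1 triggers the duplicate 'row.names' error.
err1 <- tryCatch(read.csv(text = bad_csv), error = function(e) e)
print(conditionMessage(err1))

# With row.names = NULL the rows are numbered and all 3 columns are kept.
ok <- read.csv(text = bad_csv, row.names = NULL)
print(dim(ok))

# Asking for columns that a data frame does not have gives the second error.
one_col <- read.csv(text = "x\n1\n2")
err2 <- tryCatch(one_col[, 2:3], error = function(e) e)
print(conditionMessage(err2))
```

So a file that is not shaped like the expected data can produce either message, depending only on how it is read.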

What I've tried

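Referring to the columns directly by name, e.g. "colA"
Initializing data and dataN as empty data.frames before the for loop
Initializing answer2 as an empty vector
Getting a better understanding of how vectors, matrices and data.frames work together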

**Thank you!**

2 Answers


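My problem was that the function .R file I reference in the code above was in the same directory as the data files I was looping over and analyzing. My "files" vector had the wrong length because it was also reading the other .R function that I had written and referenced earlier in the function. I believe this R file is what produced the 'undefined columns' error.

I apologize: in the end I had not even posted the area of the code where the problem actually was.

Key point: you can move between directories inside a function at any time! In fact, it may well be necessary if you want to run a function on all the contents of a directory of interest.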
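
The mix-up described above can be reproduced in a throwaway directory. The sketch below is an illustration under assumptions, not the asker's actual layout: the file names (001.csv, 002.csv, complete.R), the temporary directory and the helper count_csv are all made up. It shows that a bare list.files() also returns the script, that a pattern restricts the listing to CSV files, and that on.exit() can restore the working directory if a function does call setwd().

```r
# Hypothetical layout: two data files and one helper script in the same folder.
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, 2, 3), nitrate = c(2, 4, 7)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = c(3, 1, 2), nitrate = c(1, 5, 2)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)
writeLines("complete <- function(directory) NULL",
           file.path(demo_dir, "complete.R"))

# Without a pattern the script is listed too, so the vector is one element too long.
all_files <- list.files(demo_dir, full.names = TRUE)

# The pattern is a regular expression; this one keeps only names ending in ".csv".
csv_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# If a function changes directory, on.exit() restores it even when an error occurs.
count_csv <- function(directory) {
  old <- setwd(directory)
  on.exit(setwd(old), add = TRUE)
  length(list.files(pattern = "\\.csv$"))
}

print(basename(all_files))
print(basename(csv_files))
print(count_csv(demo_dir))
```

Passing the directory to list.files() with full.names = TRUE avoids changing the working directory at all, which may be the simpler option when the function only needs to read files.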

Answered 2024-04-30T22:17:52+08:00

One approach:

# get the list of CSV file names (pattern is a regular expression, not a glob)
files <- list.files(path = '~', pattern = '\\.csv$', full.names = TRUE)

# load all files
list.data <- lapply(files,read.csv, header = TRUE, sep = ",", row.names = NULL)

# remove rows with NAs
complete.data <- lapply(list.data,function(d) d[complete.cases(d),])

# compute correlation of the 2nd and 3rd columns in every data set
answer <- sapply(complete.data,function(d) cor(d[,2],d[,3]))

The same idea, with a slightly different implementation:

cr <- function(fname) {
    d <- read.csv(fname, header = TRUE, sep = ",", row.names = NULL)
    dc <- d[complete.cases(d),]
    cor(dc[,2],dc[,3])
}
answer2 <- sapply(files,cr)

Example CSV files:

# ==> a.csv <==
#     a,b,c,d
# 1,2,3,4
# 11,12,13,14
# 11,NA,13,14
# 11,12,13,14
# 
# ==> b.csv <==
#     A,B,C,D
# 101,102,103,104
# 101,102,103,104
# 11,12,13,14

Answered 2024-04-30T22:17:52+08:00
