
Simulating correlated variables while limiting the deviation between observed and defined correlation coefficients

    dev_allowance <- 0.15 #Deviation in r allowed
    within_limit <- FALSE #Initiate
    count <- 0            #Loop count
    nvar <- 10            #number of variables to simulate
    nobs = 50             #number of observations to simulate
    #define correlation matrix
    M = matrix(c(1., .0, .0, .0, .0, .0, .0, .0, .0, .0,
                 .0, 1., .0, .0, .0, .0, .0, .0, .0, .0,
                 .0, .0, 1., .8, .0, .0, .0, .0, .0, .0,
                 .0, .0, .8, 1., .0, .0, .0, .0, .0, .0,
                 .0, .0, .0, .0, 1., .2, .0, .0, .0, .0,
                 .0, .0, .0, .0, .2, 1., .0, .0, .0, .0,
                 .0, .0, .0, .0, .0, .0, 1., .8, .0, .0,
                 .0, .0, .0, .0, .0, .0, .8, 1., .0, .0,
                 .0, .0, .0, .0, .0, .0, .0, .0, 1., .2,
                 .0, .0, .0, .0, .0, .0, .0, .0, .2, 1.), nrow=nvar, ncol=nvar)
    L = chol(M)           # Cholesky decomposition

    #Loop while not within limit
    while (!within_limit) {
      # Generate random variables
        r = t(L) %*% matrix(rnorm(nvar*nobs), nrow=nvar, ncol=nobs)
        r = t(r)
      # Check if within limit
        within_limit <- all(abs(cor(r) - M) < dev_allowance)
      # Count loop
        count <- count + 1
    }

    cat(paste0("run count: ", count))

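I am trying to simulate about 10 random normal variables with defined correlations. At the same time, I want the correlations of the simulated variables to fall within a specific range centered on the defined correlations.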

But the run time is unacceptable, if not infinite.

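For now, I want to do nobs=50 and nobs=200. Although I planned to set dev_allowance=0.05, my current situation is that the loop can take more than a minute when dev_allowance is smaller than approx. 0.16 for nobs=50 and approx. 0.08 for nobs=200. I don't dare try any smaller dev_allowance...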

If I stick with this current parameter scheme, is there a workaround?
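To get a feel for why the loop runs so long, one option is to estimate how often a single draw passes the check. The sketch below is only an illustration: `acceptance_rate` and its `reps` argument are hypothetical names, not part of the question, and `M` is rebuilt with `diag()` so the block runs on its own.

```r
# Same correlation matrix as in the question, built from the identity matrix.
M <- diag(10)
M[3, 4] <- M[4, 3] <- 0.8
M[5, 6] <- M[6, 5] <- 0.2
M[7, 8] <- M[8, 7] <- 0.8
M[9, 10] <- M[10, 9] <- 0.2

# Hypothetical helper (an assumption, not from the question): estimates the
# fraction of independent draws that the all-at-once check would accept.
acceptance_rate <- function(M, nobs, dev_allowance, reps = 1000) {
  U <- chol(M)                     # upper-triangular factor, M = t(U) %*% U
  nvar <- nrow(M)
  accepted <- replicate(reps, {
    # nobs x nvar sample whose population correlation matrix is M
    x <- matrix(rnorm(nobs * nvar), nrow = nobs) %*% U
    all(abs(cor(x) - M) < dev_allowance)
  })
  mean(accepted)
}

# Example call with a loose tolerance; tighter tolerances may need far more
# replicates before any draw is accepted at all.
rate <- acceptance_rate(M, nobs = 50, dev_allowance = 0.3)
```

Because each pass of the loop is an independent trial, the expected number of passes is roughly `1 / rate`, so a rate near zero corresponds to the very long run times described above.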

1 Answer


    Hmm... this came to mind halfway through typing the question:

        sim_nvar <- matrix(rnorm(nobs), ncol=nobs)
        for (i in 2:nvar) {
          within_limit <- FALSE
          while (!within_limit) {
            #Generate random variables
              sim_var <- t(L)[i, 1:i] %*% rbind(sim_nvar, matrix(rnorm(nobs), ncol=nobs))
              sim_var <- t(rbind(sim_nvar, sim_var))
            #Check if within limit
              within_limit <- all(abs(cor(sim_var) - M[1:i, 1:i]) < dev_allowance)
          }
          sim_nvar <- t(sim_var)
        }
        sim_nvar <- t(sim_nvar)
    
        all(abs(cor(sim_nvar) - M) < dev_allowance)
        [1] TRUE
    

    Seems okay to me. But are there any pitfalls if I split the simulation this way? Or is this the best way?
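One point to watch: the loop above multiplies row `i` of `t(L)` by the variables that were already accepted, which are transformed values rather than raw independent draws. For this particular block-diagonal `M` the two coincide, because the first variable of each correlated pair is an untransformed draw. For a general `M`, the Cholesky weights are meant to be applied to independent normals. A minimal sketch of a variant that stores the independent draws separately follows; `simulate_sequential` and `max_tries` are hypothetical names introduced here for illustration, not the poster's code.

```r
# Hypothetical variant (an assumption, not the original answer): the accepted
# independent draws Z are kept apart from the correlated variables built from them.
simulate_sequential <- function(M, nobs, dev_allowance, max_tries = 1e5) {
  nvar <- nrow(M)
  Lt <- t(chol(M))                    # lower-triangular factor, M = Lt %*% t(Lt)
  Z <- matrix(rnorm(nobs), nrow = 1)  # one row of independent draws per variable
  for (i in 2:nvar) {
    accepted <- FALSE
    for (k in seq_len(max_tries)) {
      Z_try <- rbind(Z, rnorm(nobs))  # propose draws for variable i only
      # Lt is lower triangular, so variables 1..(i-1) stay unchanged here.
      X_try <- t(Lt[1:i, 1:i, drop = FALSE] %*% Z_try)   # nobs x i
      if (all(abs(cor(X_try) - M[1:i, 1:i]) < dev_allowance)) {
        Z <- Z_try
        accepted <- TRUE
        break
      }
    }
    if (!accepted) stop("no acceptable draw for variable ", i)
  }
  t(Lt %*% Z)                         # nobs x nvar matrix of simulated variables
}

# Usage with the correlation matrix from the question.
M <- diag(10)
M[3, 4] <- M[4, 3] <- 0.8
M[5, 6] <- M[6, 5] <- 0.2
M[7, 8] <- M[8, 7] <- 0.8
M[9, 10] <- M[10, 9] <- 0.2
sim <- simulate_sequential(M, nobs = 50, dev_allowance = 0.15)
```

The `max_tries` cap is only a safeguard so that a very tight `dev_allowance` fails with an error instead of looping indefinitely.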
