
Taking a sample with a specific mean


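Suppose I have a population such as {1,2,3,...,23}, and I want to draw a sample from it whose mean equals 6.

I tried the sample function with a custom probability vector, but it does not work: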

population <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23)
mean(population)
minimum <- min(population)
maximum <- max(population)
amplitude <- maximum - minimum 
expected <- 6
n <- length(population)
prob.vector = rep(expected, each=n)
for(i in seq(1, n)) {
  if(expected > population[i]) {
    prob.vector[i] <- (i - minimum) / (expected - minimum)
  } else {
    prob.vector[i] <- (maximum - i) / (maximum - expected)
  }
}
sample.size <- 5
sample <- sample(population, sample.size, prob = prob.vector)
mean(sample)

The sample mean comes out close to the population mean (it fluctuates around 12), whereas I want it to be around 6.
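
One way to see why the weights above miss the target is to compute the expected value of a single draw under them. The sketch below rebuilds the same triangular weights in vectorised form (the names `w` and `single_draw_mean` are illustrative, not from the original code). The weights peak at 6, but the long right tail up to 23 pulls the weighted mean to 10. This single-draw figure is only an approximation for `sample()` without replacement, but it already shows the mean landing well above 6.

```r
population <- 1:23
expected <- 6
minimum <- min(population)
maximum <- max(population)

# Same triangular weights as in the loop above, written in vectorised form.
w <- ifelse(population < expected,
            (population - minimum) / (expected - minimum),
            (maximum - population) / (maximum - expected))

# Expected value of one draw with probabilities proportional to w.
single_draw_mean <- sum(population * w) / sum(w)
single_draw_mean          # 10: the mean of the weights, not their peak

# The most likely single value is still 6 (the mode).
population[which.max(w)]  # 6
```

Putting the mode of the weights at 6 is therefore not the same as putting their mean at 6.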

A good sample would be:

  • {3,5,6,8,9}, mean = 6.2

  • {2,3,4,8,9}, mean = 5.2

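This question differs from "sample integer values in R with specific mean" because I have a specific population: I cannot simply generate arbitrary real numbers, the values must come from the population.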

Plot of the probability vector: [plot]

1 Answer


    You could try this:

    m = local({
      b = combn(1:23, 5)
      d = colMeans(b)
      e = b[, d > 5.5 & d < 6.5]
      function() sample(e[, sample(ncol(e), 1)])
    })
    m()
    # [1] 8 5 6 9 3
    m()
    # [1]  6  4  5  3 13
    

    Breakdown:

    b = combn(1:23, 5)               # all 5-element combinations of 1:23, one per column
    d = colMeans(b)                  # mean of each combination
    e = b[, d > 5.5 & d < 6.5]       # keep only combinations whose mean is within 0.5 of 6
    sample(e[, sample(ncol(e), 1)])  # pick one kept combination at random and shuffle its order
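
    The same idea can be wrapped in a reusable function. The sketch below is one possible generalisation, not part of the original answer: the name `make_mean_sampler` and the parameters `size`, `target` and `tol` are hypothetical. It assumes `population` is a vector with more than one element and that enumerating all combinations with `combn` is feasible (for 1:23 and samples of 5 that is 33,649 columns).

```r
# Hypothetical helper: builds a sampler that returns `size` distinct values
# from `population` whose mean lies strictly within `tol` of `target`.
make_mean_sampler <- function(population, size, target, tol = 0.5) {
  combos <- combn(population, size)   # one candidate sample per column
  means  <- colMeans(combos)
  keep   <- combos[, abs(means - target) < tol, drop = FALSE]
  if (ncol(keep) == 0) {
    stop("no combination has a mean within `tol` of `target`")
  }
  function() {
    chosen <- keep[, sample.int(ncol(keep), 1)]  # pick one column uniformly
    chosen[sample.int(length(chosen))]           # shuffle its order
  }
}

set.seed(1)                                      # only for reproducibility
draw <- make_mean_sampler(1:23, size = 5, target = 6)
s <- draw()
mean(s)                                          # lies between 5.5 and 6.5
```

    Indexing with `sample.int` avoids the case where `sample(x)` is given a single number and samples from `1:x` instead. For populations where the number of combinations is too large to enumerate, this approach would need to be replaced, for example by repeated drawing until the mean falls in range.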
    
