
Lagged variables in a panel with ddply


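I am trying to generate the change in precision (based on estimated confidence intervals) in what is essentially a panel data set.

So, as a simple example, here is the function I have written, applied to a nonsensical example....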

precision.gain <- function(x){
  x        <- ts(x, start=x[1])
  x.length <- seq(length = length(x))
  x.lag    <- lag(x, -1)
  x.gain   <- ((x - x.lag) * 100) / x
  x.gain   <- c(NA, x.gain)
  x.gain
}
t <- data.frame(x=1:20)
t <- cbind(t, precision.gain(t$x))
t
x precision.gain(t$x)
1   1                  NA
2   2           50.000000
3   3           33.333333
4   4           25.000000
5   5           20.000000 
6   6           16.666667
7   7           14.285714
8   8           12.500000
9   9           11.111111
10 10           10.000000
11 11            9.090909
12 12            8.333333
13 13            7.692308
14 14            7.142857
15 15            6.666667
16 16            6.250000
17 17            5.882353
18 18            5.555556
19 19            5.263158
20 20            5.000000

This works, but I am having trouble (or, more likely, misunderstanding something) when I then apply it to my data frame, a sample of which is....

subset(results.normal.sum, n2 > 20 & n2 < 30, select=c(sd2, n2, ci.width1))
    sd2 n2 ci.width1
11  0.4 22 0.6528714
12  0.4 24 0.6167015
13  0.4 26 0.5895856
14  0.4 28 0.5658297
46  0.6 22 0.6529126
47  0.6 24 0.6196544
48  0.6 26 0.5922061
49  0.6 28 0.5642688
81  0.8 22 0.6513849
82  0.8 24 0.6194468
83  0.8 26 0.5923094
84  0.8 28 0.5636396
116 1.0 22 0.6522927
117 1.0 24 0.6191043
118 1.0 26 0.5900129
119 1.0 28 0.5652429
151 1.2 22 0.6518072
152 1.2 24 0.6193353
153 1.2 26 0.5892683
154 1.2 28 0.5632235
186 1.4 22 0.6527031
187 1.4 24 0.6191458
188 1.4 26 0.5899453
189 1.4 28 0.5640431
221 1.6 22 0.6521401
222 1.6 24 0.6191883
223 1.6 26 0.5893458
224 1.6 28 0.5637215
256 1.8 22 0.6512491
257 1.8 24 0.6180401
258 1.8 26 0.5905810
259 1.8 28 0.5647388
291 2.0 22 0.6515769
292 2.0 24 0.6183121
293 2.0 26 0.5896990
294 2.0 28 0.5663394

I have tried using ddply() from Hadley Wickham's plyr package....

ddply(results.normal.sum, .(sd2), precision.gain, x=ci.width1)
Error in .fun(piece, ...) : unused argument(s) (piece)
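A likely reading of this error: ddply() calls the supplied function with each data-frame piece as its first argument. Since `x` is already matched by name, the piece has nowhere to go and is reported as unused. The following is a minimal sketch of one workaround, assuming a wrapper that extracts the column from the piece itself. The helper `gain_pct()` is an assumed vectorised equivalent of precision.gain() (not the original function), and `df` holds only the first two sd2 groups of the sample above.

```r
library(plyr)

# Assumed vectorised equivalent of precision.gain(): percentage change
# relative to the current value, with NA for the first element.
gain_pct <- function(x) c(NA, diff(x) * 100 / x[-1])

# First two sd2 groups from the sample above.
df <- data.frame(
  sd2       = rep(c(0.4, 0.6), each = 4),
  n2        = rep(c(22, 24, 26, 28), times = 2),
  ci.width1 = c(0.6528714, 0.6167015, 0.5895856, 0.5658297,
                0.6529126, 0.6196544, 0.5922061, 0.5642688)
)

# ddply() hands each piece (a data frame) to the function, so the wrapper
# pulls out the column and returns the piece with the new column attached.
res <- ddply(df, "sd2", function(piece) {
  piece$prec.gain <- gain_pct(piece$ci.width1)
  piece
})
res
```

Because the wrapper returns a whole data frame per group, ddply() can stack the pieces back into one data frame with the gain aligned to its rows.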

Using tapply() directly gets me most of the way there, but it does not return a data frame that can be cbind()'d....

> tapply(results.normal.sum$ci.width1, sd2, precision.gain)
$`0.4`
 [1]          NA -771.332292  -68.852635  -30.514545  -19.877447  -14.515380
 [7]  -11.147183   -9.282641   -7.680418   -6.836209   -5.954992   -5.865053
[13]   -4.599158   -4.198409   -4.155838   -3.529773   -3.590234   -3.432364
[19]   -2.899601   -3.092533   -2.721967   -2.506706   -2.498318   -2.321500
[25]   -2.299822   -2.187855   -2.116990   -1.896162   -1.853487   -1.604902
[31]   -2.194138   -1.473042   -1.710051   -1.701994   -1.417754

$`0.6`
 [1]          NA -756.196418  -68.222048  -30.566420  -19.216860  -15.162929
 [7]  -10.645899   -9.628775   -7.326799   -7.178820   -5.770681   -5.367216
[13]   -4.634938   -4.951049   -3.949776   -3.761633   -3.326209   -3.387764
[19]   -3.009317   -3.074398   -2.397660   -2.678573   -2.626077   -2.268373
[25]   -2.426720   -1.956498   -2.119986   -1.859410   -1.992678   -1.707448
[31]   -1.991583   -1.595951   -1.765913   -1.415065   -1.655725
....
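For comparison, base R's ave() applies a function within groups like tapply() does, but returns a vector in the original row order, which can be assigned back as a column. This is a hedged sketch, not necessarily what was intended here: `gain_pct()` is an assumed vectorised stand-in for precision.gain(), and `df` is a small assumed subset (the first two sd2 groups of the sample above).

```r
# Assumed vectorised equivalent of precision.gain(): percentage change
# relative to the current value, with NA for the first element.
gain_pct <- function(x) c(NA, diff(x) * 100 / x[-1])

# First two sd2 groups from the sample above.
df <- data.frame(
  sd2       = rep(c(0.4, 0.6), each = 4),
  n2        = rep(c(22, 24, 26, 28), times = 2),
  ci.width1 = c(0.6528714, 0.6167015, 0.5895856, 0.5658297,
                0.6529126, 0.6196544, 0.5922061, 0.5642688)
)

# ave() evaluates FUN separately for each sd2 group and returns a vector
# the same length as the input, aligned with the original rows.
df$prec.gain <- ave(df$ci.width1, df$sd2, FUN = gain_pct)
df
```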

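I feel I am close, but I am missing or misunderstanding something.

I found a similar question here, but I just don't understand the answers/solutions provided.

Thanks in advance for any help,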

slackline

1 Answer

  • 1

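    If I have guessed correctly what you need, here is a solution using data.table, which computes the gain within each sd2 group via by. (The code below builds the result with list(); the := operator could instead add the column in place.)

    First, read in the sample data: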

    testData <- textConnection("sd2 n2 ci.width1
    11  0.4 22 0.6528714
    12  0.4 24 0.6167015
    13  0.4 26 0.5895856
    14  0.4 28 0.5658297
    46  0.6 22 0.6529126
    47  0.6 24 0.6196544
    48  0.6 26 0.5922061
    49  0.6 28 0.5642688
    81  0.8 22 0.6513849
    82  0.8 24 0.6194468
    83  0.8 26 0.5923094
    84  0.8 28 0.5636396
    116 1.0 22 0.6522927
    117 1.0 24 0.6191043
    118 1.0 26 0.5900129
    119 1.0 28 0.5652429
    151 1.2 22 0.6518072
    152 1.2 24 0.6193353
    153 1.2 26 0.5892683
    154 1.2 28 0.5632235
    186 1.4 22 0.6527031
    187 1.4 24 0.6191458
    188 1.4 26 0.5899453
    189 1.4 28 0.5640431
    221 1.6 22 0.6521401
    222 1.6 24 0.6191883
    223 1.6 26 0.5893458
    224 1.6 28 0.5637215
    256 1.8 22 0.6512491
    257 1.8 24 0.6180401
    258 1.8 26 0.5905810
    259 1.8 28 0.5647388
    291 2.0 22 0.6515769
    292 2.0 24 0.6183121
    293 2.0 26 0.5896990
    294 2.0 28 0.5663394")
    

    Then put the data into a data.table and...

    library(data.table)
    dt <- data.table(read.table(testData, header = TRUE))
    dt[, list(n2, ci.width1, prec.gain = precision.gain(ci.width1)), by = sd2]
    

    Here is the output:

    > dt[, list(n2, ci.width1, prec.gain = precision.gain(ci.width1)), by = sd2]
       sd2 n2 ci.width1 prec.gain
       0.4 22 0.6528714        NA
       0.4 24 0.6167015 -5.865058
       0.4 26 0.5895856 -4.599146
       0.4 28 0.5658297 -4.198419
       0.6 22 0.6529126        NA
       0.6 24 0.6196544 -5.367218
       0.6 26 0.5922061 -4.634924
       0.6 28 0.5642688 -4.951062
       0.8 22 0.6513849        NA
       0.8 24 0.6194468 -5.155907
       0.8 26 0.5923094 -4.581626
       0.8 28 0.5636396 -5.086548
         1 22 0.6522927        NA
         1 24 0.6191043 -5.360712
         1 26 0.5900129 -4.930638
         1 28 0.5652429 -4.382187
       1.2 22 0.6518072        NA
       1.2 24 0.6193353 -5.243024
       1.2 26 0.5892683 -5.102430
       1.2 28 0.5632235 -4.624239
       1.4 22 0.6527031        NA
       1.4 24 0.6191458 -5.419935
       1.4 26 0.5899453 -4.949696
       1.4 28 0.5640431 -4.592238
       1.6 22 0.6521401        NA
       1.6 24 0.6191883 -5.321774
       1.6 26 0.5893458 -5.063666
       1.6 28 0.5637215 -4.545560
       1.8 22 0.6512491        NA
       1.8 24 0.6180401 -5.373276
       1.8 26 0.5905810 -4.649506
       1.8 28 0.5647388 -4.575956
         2 22 0.6515769        NA
         2 24 0.6183121 -5.379937
         2 26 0.5896990 -4.852153
         2 28 0.5663394 -4.124664
    cn sd2 n2 ci.width1 prec.gain
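The := operator mentioned in this answer can also add the column to the existing table by reference rather than building a new one. The following is a minimal sketch under stated assumptions: `gain_pct()` is an assumed vectorised stand-in for precision.gain(), and the table contains only the first two sd2 groups of the sample data.

```r
library(data.table)

# Assumed vectorised equivalent of precision.gain(): percentage change
# relative to the current value, with NA for the first element.
gain_pct <- function(x) c(NA, diff(x) * 100 / x[-1])

# First two sd2 groups from the sample data.
dt <- data.table(
  sd2       = rep(c(0.4, 0.6), each = 4),
  n2        = rep(c(22, 24, 26, 28), times = 2),
  ci.width1 = c(0.6528714, 0.6167015, 0.5895856, 0.5658297,
                0.6529126, 0.6196544, 0.5922061, 0.5642688)
)

# := adds prec.gain to dt in place; by = sd2 restarts the lag in each group.
dt[, prec.gain := gain_pct(ci.width1), by = sd2]
dt
```

Since := modifies `dt` itself, no reassignment or cbind() step is needed afterwards.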
    
