我一直在使用R中的glm()来计算控制单个二项式绘制的logit概率参数的置信区间 .

P <- 20 # Number of successes
D <- 1  # Number of failures
model1 <- glm(matrix(c(P,D), nrow=1) ~ 1, family="binomial") # Successes modeled as binomial draw from successes+failures
summary(model1)
confint(model1)  # Confidence interval on binomial parameter p

我注意到,如果成功或失败的次数为零( P=0D=0 ),则返回的置信区间毫无意义 .

然后我通过数值积分归一化二项式似然来计算我自己的置信区间:

binomial_fun <- function(p) {choose(N, K)*(p^K)*(1-p)^(N-K)}  # A binomial function
logit_fun <- function(p) {log(p/(1-p))}                       # A logit function
f_upper <- function(a) {integrate(binomial_fun, 0, a )$value/integrate(binomial_fun, 0, 1 )$value - .975}
f_lower <- function(a) {integrate(binomial_fun, a, 1 )$value/integrate(binomial_fun, 0, 1 )$value - .975}
# These functions will take value zero when a corresponds to the 97.5% and 2.5% points
# of the normalized binomial likelihood given N and K.

N <- P+D     # N is the number of trials
K <- P       # K is the number of successes
uci2 <- logit_fun(uniroot( f_upper, c(0, 1) )$root) # Look for a solution in [0,1]
lci2 <- logit_fun(uniroot( f_lower, c(0, 1) )$root) # Look for a solution in [0,1]

这给出了有意义的置信区间 P=0D=0 . 但是,它给出了 glm() confint() 的不同置信区间,即使 PD 都不为零 .

confint(model1)
c(lci2, uci2)

glm() confint() 相比,我计算的lci和uci都倾向于接近于零 .

如何 confint() 计算间隔?我确信在更复杂的glms的性能方面这样做有一个很好的理由,但在这个简单的情况下它似乎是一个奇怪的结果 .