
How do I plot the ROC curve for a logistic regression (LASSO) in R?


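I fitted a logistic regression model to a training data set in R, more specifically a LASSO regression with an L1 penalty. I used the glmnet package. The code for the model is shown below.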

t1 <- Sys.time()
glmnet_classifier <- cv.glmnet(x = dtm_train_tfidf,
                           y = tweets_train[['sentiment']], 
                           family = 'binomial', 
                           # L1 penalty
                           alpha = 1,
                           # interested in the area under ROC curve
                           type.measure = "auc",
                           # 5-fold cross-validation
                           nfolds = 5,
                           # high value is less accurate, but has faster training
                           thresh = 1e-3,
                           # again lower number of iterations for faster training
                           maxit = 1e3)
print(difftime(Sys.time(), t1, units = 'mins'))

preds <- predict(glmnet_classifier, dtm_test_tfidf, type = 'response')[ ,1]

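Now I would like to plot the ROC curve, but I cannot figure out how to do it.

When I call plot(glmnet_classifier), this is what I get:
[Image: plot of classifier]

Since this is not the ROC curve, I was wondering whether anyone knows how to plot it in R. I have already tried the ROCR package, but it gives me an error: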

roc.perf = performance(preds, measure = "tpr", x.measure = "fpr")

Can anyone help? Thank you very much!

2 Answers

  • 2
    library(pROC)
    data("aSAH")
    
    # Ordinary (unpenalized) logistic regression on the aSAH example data
    fit <- glm(outcome ~ gender + age + wfns + s100b, data = aSAH, family = binomial)
    
    # Build the ROC curve from the fitted probabilities and plot it with AUC and CI
    roc(aSAH$outcome, as.vector(fitted.values(fit)),
        percent = FALSE, boot.n = 1000, ci.alpha = 0.9, stratified = FALSE,
        plot = TRUE, grid = TRUE, legacy.axes = TRUE, reuse.auc = TRUE,
        # print.thres = c(0.30, 0.35, 0.40, 0.45, 0.48, 0.50, 0.55, 0.60),
        print.auc = TRUE, print.thres.col = "blue", print.thres.cex = 0.7,
        ci = TRUE, ci.type = "bars",
        main = paste("ROC curve using", "(N = ", nrow(aSAH), ")"))
    


    I hope this helps ;)

  • 5

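    Your problem with ROCR is that you call performance directly on the predictions rather than on a standardized prediction object. Here is an example of how to plot the ROC curve: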

    library(ggplot2) # For diamonds data
    library(ROCR) # For ROC curves
    library(glmnet) # For regularized GLMs
    
    
    # Classification problem
    class <- diamonds$price > median(diamonds$price) # The top 50% valued diamonds
    X <- as.matrix(diamonds[, c('carat', 'depth', 'x', 'y', 'z')]) # Predictor variables
    
    # L1 regularized logistic regression
    fit <- cv.glmnet(x = X, y = class, family = 'binomial', type.measure = 'class', alpha = 1)
    
    # Predict from model
    preds <- predict(fit, newx = X, type = 'response')
    
    # ROCR for the ROC curve: wrap the predictions and true labels in a prediction
    # object, then compute the true positive rate against the false positive rate
    perf <- performance(prediction(preds, class), 'tpr', 'fpr')
    
    plot(perf)
    

    ROC curve
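
To make it clearer what the 'tpr' and 'fpr' values in the answer above represent, here is a minimal base R sketch that builds the same kind of curve by hand from predicted probabilities and true labels. The helper names roc_points and auc_trapezoid and the toy data are hypothetical, made up for illustration; they are not part of ROCR, pROC, or glmnet, and this is not how those packages are implemented internally.

```r
# Hypothetical helper (not from any package): ROC points from scores and labels.
roc_points <- function(scores, labels) {
  stopifnot(length(scores) == length(labels))
  labels <- as.logical(labels)
  # Sort cases from the highest to the lowest predicted probability
  ord <- order(scores, decreasing = TRUE)
  scores <- scores[ord]
  labels <- labels[ord]
  # Cumulative true/false positive counts as the threshold is lowered
  tp <- cumsum(labels)
  fp <- cumsum(!labels)
  # Keep one point per distinct score so tied scores move together
  keep <- !duplicated(scores, fromLast = TRUE)
  data.frame(
    fpr = c(0, fp[keep] / sum(!labels)),
    tpr = c(0, tp[keep] / sum(labels))
  )
}

# Hypothetical helper: area under the curve by the trapezoidal rule
auc_trapezoid <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

# Toy data, invented for illustration only
toy_scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
toy_labels <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)

roc <- roc_points(toy_scores, toy_labels)
toy_auc <- auc_trapezoid(roc)

# Draw the curve with the chance diagonal for reference
plot(roc$fpr, roc$tpr, type = "l",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)
print(toy_auc)
```

With the objects from the question, the same idea would presumably apply to preds and the true test-set labels, which the question does not show.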
