Strange ROC curve prediction

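I have the following predictions from an SVM model (prediction_svm_linear), and I want to plot the ROC curve with the pROC package in R. I get an AUC of 100%, which seems impossible because, according to the confusion matrix, my predictions are not perfect. Clearly I am missing something, and perhaps I do not fully understand how ROC curves work. Could you explain why this happens?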

Confusion Matrix and Statistics

          Reference
Prediction Cancer Normal
    Cancer     11      0
    Normal      3      5

               Accuracy : 0.8421
                 95% CI : (0.6042, 0.9662)
    No Information Rate : 0.7368
    P-Value [Acc > NIR] : 0.2227

                  Kappa : 0.6587
 Mcnemar's Test P-Value : 0.2482

            Sensitivity : 0.7857
            Specificity : 1.0000
         Pos Pred Value : 1.0000
         Neg Pred Value : 0.6250
             Prevalence : 0.7368
         Detection Rate : 0.5789
   Detection Prevalence : 0.5789
      Balanced Accuracy : 0.8929

       'Positive' Class : Cancer

Here is my code:

    library(pROC)
    testData_class <- rep(c("Normal", "Cancer"), c(5, 14))
    # testData is not defined in the post; rebuilt here so that testData$class is a factor
    testData <- data.frame(class = factor(testData_class))
    prediction_svm_linear <- data.frame(Cancer = c(0.11766249, 0.04765463, 0.08749940, 0.01715765, 0.10755376, 0.28358435, 0.37478957, 0.90603193, 0.91077112, 0.68602820, 0.64783894, 0.67916187, 0.38785763, 0.66440580, 0.51897036, 0.93484214, 0.91719866, 0.83239007, 0.63491027), Normal = c(0.88233751, 0.95234537, 0.91250060, 0.98284235, 0.89244624, 0.71641565, 0.62521043, 0.09396807, 0.08922888, 0.31397180, 0.35216106, 0.32083813, 0.61214237, 0.33559420, 0.48102964, 0.06515786, 0.08280134, 0.16760993, 0.36508973))

    # levels = c("Normal", "Cancer"): Normal is the control group, Cancer the case group
    result.roc.model1 <- roc(testData$class, prediction_svm_linear$Cancer,
                             levels = rev(levels(testData$class)))


>result.roc.model1
Call:
roc.default(response = testData$class, predictor = prediction.prob.b5_svm_linear$Cancer,     levels = rev(levels(testData$class)))

Data: prediction.prob.b5_svm_linear$Cancer in 5 controls (testData$class Normal) < 14 cases (testData$class Cancer).
Area under the curve: 1

2 Answers

0
Sorry, I may have confused you, but here is all the information.

Binary predictions:

prediction_svm = c("Normal", "Normal", "Normal", "Normal", "Normal", "Normal", "Normal", "Cancer", "Cancer", "Cancer", "Cancer", "Cancer", "Normal", "Cancer", "Cancer", "Cancer", "Cancer", "Cancer", "Cancer")

Ground truth:

testData_class = c(rep(c("Normal", "Cancer"), c(5, 14)))

Probability predictions:

prediction_svm_linear.prob = data.frame(Cancer = c(0.11766249, 0.04765463, 0.08749940, 0.01715765, 0.10755376, 0.28358435, 0.37478957, 0.90603193, 0.91077112, 0.68602820, 0.64783894, 0.67916187,0.38785763, 0.66440580, 0.51897036, 0.93484214, 0.91719866, 0.83239007, 0.63491027), Normal = c(0.88233751, 0.95234537, 0.91250060, 0.98284235, 0.89244624, 0.71641565, 0.62521043, 0.09396807, 0.08922888, 0.31397180, 0.35216106, 0.32083813,0.61214237, 0.33559420, 0.48102964, 0.06515786, 0.08280134, 0.16760993, 0.36508973))

I am building the confusion matrix with this command:

confusionMatrix(prediction_svm, testData$class)
```
library(pROC)
    result.roc.model1 <-  roc(testData$class, prediction_svm_linear.prob$Cancer, 
                            levels = rev(levels(testData$class)))


>result.roc.model1
Call:
roc.default(response = testData$class, predictor = prediction.prob.b5_svm_linear$Cancer,     levels = rev(levels(testData$class)))

Data: prediction.prob.b5_svm_linear$Cancer in 5 controls (testData$class Normal) < 14 cases (testData$class Cancer).
Area under the curve: 1


>result.coords.model1 <- coords(  result.roc.model1, "best", best.method="closest.topleft",ret=c("threshold", "accuracy"))

>result.coords.model1
```
threshold  accuracy
0.2006234 1.0000000
Answered 2024-05-04T19:05:21+08:00
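The AUC of 1 and the threshold of 0.2006 reported above can be checked by hand. The following is a minimal base R sketch, not the pROC implementation: it uses the data from this post, and the variable names (`truth`, `p_cancer`, `normal_scores`, `cancer_scores`) are made up for illustration. It computes the AUC as the fraction of (Cancer, Normal) pairs in which the Cancer sample gets the higher score, and the midpoint of the gap between the two groups of scores.

```r
# Base R only; data copied from the post above.
truth <- rep(c("Normal", "Cancer"), c(5, 14))
p_cancer <- c(0.11766249, 0.04765463, 0.08749940, 0.01715765, 0.10755376,
              0.28358435, 0.37478957, 0.90603193, 0.91077112, 0.68602820,
              0.64783894, 0.67916187, 0.38785763, 0.66440580, 0.51897036,
              0.93484214, 0.91719866, 0.83239007, 0.63491027)

normal_scores <- p_cancer[truth == "Normal"]
cancer_scores <- p_cancer[truth == "Cancer"]

# AUC as the share of (Cancer, Normal) pairs ranked correctly; ties count 1/2.
pairs_gt <- outer(cancer_scores, normal_scores, ">")
pairs_eq <- outer(cancer_scores, normal_scores, "==")
auc <- mean(pairs_gt + 0.5 * pairs_eq)

# Midpoint between the highest Normal score and the lowest Cancer score.
gap_midpoint <- (max(normal_scores) + min(cancer_scores)) / 2

print(auc)
print(gap_midpoint)
```

Every Cancer sample scores higher than every Normal sample, so the ranking is perfect even though a cutoff of 0.5 is not: the AUC depends only on the ordering of the scores, not on any single threshold.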
1
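From your comments, I suspect you are misusing the confusionMatrix function from the caret package. According to the documentation, the second argument should be "a factor of classes to be used as the true results", whereas your comments suggest you are passing a data.frame of continuous predictions. Not only is that in the wrong format, but the predictions should also be the first argument.

You should use something like this: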
```
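library(caret)  # provides confusionMatrix()
lv <- c("Cancer", "Normal")  # both arguments should be factors with the same levels
predictions <- factor(ifelse(prediction_svm_linear$Cancer > 0.2, "Cancer", "Normal"), levels = lv)
confusionMatrix(predictions, factor(testData_class, levels = lv))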
```
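To see how much the cutoff matters, here is a rough base R sketch that tabulates the same probabilities at 0.5 and at 0.2. It uses `table()` rather than caret so it runs without extra packages; `confusion_at` is a hypothetical helper written for this illustration, and treating 0.5 as the cutoff behind the original binary predictions is an assumption (it does reproduce the confusion matrix in the question).

```r
# Base R only; data copied from the question.
truth <- factor(rep(c("Normal", "Cancer"), c(5, 14)),
                levels = c("Cancer", "Normal"))
p_cancer <- c(0.11766249, 0.04765463, 0.08749940, 0.01715765, 0.10755376,
              0.28358435, 0.37478957, 0.90603193, 0.91077112, 0.68602820,
              0.64783894, 0.67916187, 0.38785763, 0.66440580, 0.51897036,
              0.93484214, 0.91719866, 0.83239007, 0.63491027)

# Hypothetical helper: confusion table for a given cutoff on P(Cancer).
confusion_at <- function(threshold) {
  pred <- factor(ifelse(p_cancer > threshold, "Cancer", "Normal"),
                 levels = levels(truth))
  table(Prediction = pred, Reference = truth)
}

cm_050 <- confusion_at(0.5)  # assumed cutoff behind the binary predictions
cm_020 <- confusion_at(0.2)  # cutoff suggested in this answer

print(cm_050)
print(cm_020)
```

At 0.5, three Cancer samples (scores 0.28, 0.37 and 0.39) fall below the cutoff and are labelled Normal; at 0.2 all 19 samples are classified correctly, which is consistent with an AUC of 1.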
Answered 2024-05-04T19:05:21+08:00
