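I am using the SVM from the e1071 package for binary classification, and I am comparing the "probabilities" attribute against the class predicted by the SVM. What confuses me is that the class returned by the predict function (0 or 1) does not seem to agree with the probabilities listed in that attribute. For some very high probabilities of level 1 the SVM predicts level 0, and for some low probabilities of level 1 it predicts level 1.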

Here is sample code and its results:

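library(e1071)

# Fit an RBF-kernel SVM with probability estimates enabled
svm_model <- svm(as.factor(CHURNED) ~ .,
                 scale = FALSE,
                 data = train,
                 cost = 1,
                 gamma = 0.1,
                 kernel = "radial",
                 probability = TRUE)

# Predicted class; the class probabilities are attached as an attribute
test$Pred_Class <- predict(svm_model, test, probability = TRUE)
# First column of the "probabilities" matrix
test$Pred_Prob <- attr(test$Pred_Class, "probabilities")[, 1]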

The results are below (the rows are taken from different positions in the output to show a variety of cases):

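CHURNED: the response variable being predicted

Pred_Class: the class predicted by the SVM

Pred_Prob: the predicted probability. Is this what the SVM uses to assign the class?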

CHURNED Pred_Class  Pred_Prob
    1   0   0.03968526    # --> makes sense
    1   0   0.03968526
    1   0   0.07033222
    1   0   0.11711195
    1   0   0.12477983
    1   0   0.12827296
    1   0   0.12829345
    1   0   0.12829345
    1   0   0.12829345
    1   0   0.12829444
    1   0   0.12829927
    1   0   0.12829927
    1   0   0.12831169
    1   0   0.12831169
    1   0   0.12831428
    1   1   0.13053475   # --> doesn't make sense. Prob is less than 0.5
    1   1   0.13053475
    1   1   0.13053475
    1   1   0.1305348
    1   1   0.1305348
    1   1   0.1305348
    1   1   0.1690807
    1   1   0.2206993
    1   1   0.2321171
    0   0   0.998289      # --> doesn't make sense. Prob is almost 1!
    0   0   0.9982887
    0   0   0.993133
    0   0   0.9898889
    1   0   0.9849951
    0   0   0.9849951
    1   0   0.546427
    0   0   0.5440994    # --> doesn't make sense. Prob is more than 0.5
    0   0   0.5437889
    1   0   0.5417848
    0   0   0.5284112
    0   0   0.5252177
    0   1   0.5180776   # --> makes sense but is not consistent with above example
    0   1   0.5180704
    1   1   0.5180436
    1   1   0.5180436
    0   1   0.518043

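This result makes no sense to me at all. The predicted class and the predicted probability do not match. I have checked to make sure I am referencing the correct column of the "probabilities" attribute matrix: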

test$Pred_Class
  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 [98] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
attr(,"probabilities")
             1         0
6442 0.2369796 0.7630204
6443 0.2520246 0.7479754
6513 0.2322581 0.7677419
6801 0.2309437 0.7690563
6802 0.2244768 0.7755232
6954 0.2322450 0.7677550
6968 0.2537544 0.7462456
6989 0.2352477 0.7647523
7072 0.2322308 0.7677692
...
...
...
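
One way to make that column check explicit is to select the probability column by name rather than by position. The following is only a minimal sketch in base R: it retypes the first three rows of the printout above by hand, and it assumes the columns are named after the factor levels ("1" and "0"), as that printout shows. The names `probs`, `p1` and `implied` are made up for illustration.

```r
# First three rows of the "probabilities" attribute printed above, typed in by hand.
# With a real model this matrix would come from:
#   attr(predict(svm_model, test, probability = TRUE), "probabilities")
probs <- matrix(
  c(0.2369796, 0.7630204,
    0.2520246, 0.7479754,
    0.2322581, 0.7677419),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("6442", "6443", "6513"), c("1", "0"))
)

# Select the class-1 column by name, so the result does not depend on column order.
p1 <- probs[, "1"]

# Class implied by a 0.5 cut-off on that probability (illustrative helper only).
implied <- ifelse(p1 > 0.5, "1", "0")

# Show the probability next to the class it would imply.
print(data.frame(p1 = p1, implied = implied))
```

Tabulating `implied` against `Pred_Class` for the whole test set would then show how many rows actually disagree.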

Maybe I am interpreting the probabilities incorrectly?