
Using an RBF kernel with a chi-squared distance metric in an SVM


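How can I accomplish the task described in the title? Does the RBF kernel have any parameter that sets the distance metric to the chi-squared distance? I can see chi2_kernel in the sklearn library.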

Here is the code I wrote.

import numpy as np
from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, classification_report, confusion_matrix

from sklearn.impute import SimpleImputer  # Imputer was removed in scikit-learn 0.22
from numpy import genfromtxt
from sklearn.metrics.pairwise import chi2_kernel


file_csv = 'dermatology.data.csv'
dataset = genfromtxt(file_csv, delimiter=',')

# SimpleImputer fills missing values per column; the old axis=1 (per-row) option no longer exists.
imp = SimpleImputer(missing_values=np.nan, strategy='most_frequent')
dataset = imp.fit_transform(dataset)

target = dataset[:, [34]].flatten()
data = dataset[:, range(0,34)]

X_train, X_test, y_train, y_test = train_test_split(data, target, test_size=0.3)

# TODO: use a chi-squared distance metric here instead. How can this be done?
clf = svm.SVC(kernel='rbf', C=1)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print(f1_score(y_test, y_pred, average="macro"))
print(precision_score(y_test, y_pred, average="macro"))
print(recall_score(y_test, y_pred, average="macro"))

1 Answer

    Are you sure you want to compose RBF and chi2? Chi2 by itself defines a valid kernel, so all you have to do is

    clf = svm.SVC(kernel=chi2_kernel, C=1)
    

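    since sklearn accepts functions as kernels (though this will require O(N^2) memory and time). If you want to compose the two, it gets a bit more complicated: you will have to implement your own kernel to do that. For a bit more control (and other kernels), you could also try pykernels, although it does not support composing kernels yet.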
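
To make the two options above concrete, here is a minimal sketch rather than a definitive implementation. It assumes synthetic non-negative data (chi-squared kernels require non-negative features), and the function name rbf_times_chi2 and both gamma values are hypothetical choices for illustration. It first checks that sklearn's chi2_kernel is already the exponentiated chi-squared distance, which is the RBF-style form the question asks about, and then builds one possible composed kernel as the element-wise product of an RBF kernel and a chi2 kernel.

```python
import numpy as np
from sklearn.metrics.pairwise import additive_chi2_kernel, chi2_kernel, rbf_kernel
from sklearn.svm import SVC

# Synthetic non-negative data (assumption: stands in for the real dataset).
rng = np.random.RandomState(0)
X = rng.randint(0, 4, size=(60, 5)).astype(float)
y = (X[:, 0] + X[:, 1] > 3).astype(int)

# chi2_kernel computes exp(-gamma * sum((x - y)^2 / (x + y))).
# additive_chi2_kernel returns the negated chi-squared distance, so
# exponentiating gamma times it should reproduce chi2_kernel.
gamma = 0.5
K_builtin = chi2_kernel(X, X, gamma=gamma)
K_manual = np.exp(gamma * additive_chi2_kernel(X, X))
same = np.allclose(K_builtin, K_manual)


def rbf_times_chi2(A, B, gamma_rbf=0.1, gamma_chi2=0.5):
    """Hypothetical composed kernel: element-wise product of RBF and chi2.

    The product of two valid (positive semi-definite) kernels is itself
    a valid kernel, so this can be passed to SVC as a callable.
    """
    return rbf_kernel(A, B, gamma=gamma_rbf) * chi2_kernel(A, B, gamma=gamma_chi2)


# SVC calls the kernel function with two sample matrices and expects
# a matrix of shape (n_samples_A, n_samples_B) in return.
clf = SVC(kernel=rbf_times_chi2, C=1).fit(X, y)
preds = clf.predict(X)
```

If the plain chi2 kernel is enough, a different gamma can be supplied by wrapping chi2_kernel in a small function in the same way. Both gamma values would normally be tuned, for example by cross-validation.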
