
Changing the softmax behavior of Multinomial Naive Bayes


In scikit-learn, I am using MultinomialNB for multi-class classification of labeled text data. For prediction I use MultinomialNB's `predict_proba` method:

from sklearn.naive_bayes import MultinomialNB

clf = MultinomialNB()
print(clf.fit(X_train, Y_train))
clf.predict_proba(X_test[0])

As a result I get a vector of probability values, one per class, which add up to 1. I understand this is because of the softmax / cross-entropy style normalization.

array([[0.01245064,0.02346781,0.84694063,0.03238112,0.01833107,0.03103464,0.03539408]])

My question is: for my prediction I need something like binary_cross_entropy, i.e. a probability value between 0 and 1 for each class, independent of the other classes. How can I change this behavior when predicting in scikit-learn?

1 Answer


    You can get the log likelihood for each class using the following method:

    def _joint_log_likelihood(self, X):
            """Compute the unnormalized posterior log probability of X
            I.e. ``log P(c) + log P(x|c)`` for all rows x of X, as an array-like of
            shape [n_classes, n_samples].
            Input is passed to _joint_log_likelihood as-is by predict,
            predict_proba and predict_log_proba.
            """
    

    Naive Bayes' `predict_log_proba` then simply normalizes the output of the function above:

    def predict_log_proba(self, X):
            """
            Return log-probability estimates for the test vector X.
            Parameters
            ----------
            X : array-like, shape = [n_samples, n_features]
            Returns
            -------
            C : array-like, shape = [n_samples, n_classes]
                Returns the log-probability of the samples for each class in
                the model. The columns correspond to the classes in sorted
                order, as they appear in the attribute `classes_`.
            """
            jll = self._joint_log_likelihood(X)
            # normalize by P(x) = P(f_1, ..., f_n)
            log_prob_x = logsumexp(jll, axis=1)
            return jll - np.atleast_2d(log_prob_x).T
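
    If what you actually need is an independent probability between 0 and 1 for each class, MultinomialNB's `predict_proba` will not give you that, since (as the code above shows) it always normalizes over all classes. A minimal sketch of one possible workaround, assuming `Y_train` is a NumPy array (or pandas Series) of class labels so that `Y_train == c` broadcasts, is to fit one binary "class vs. the rest" MultinomialNB per class and read off each positive-class probability:

    import numpy as np
    from sklearn.naive_bayes import MultinomialNB

    classes = np.unique(Y_train)

    # One binary "class c vs. the rest" model per class.
    binary_clfs = [MultinomialNB().fit(X_train, Y_train == c) for c in classes]

    # Column i is P(class = classes[i] | x) from its own binary model;
    # each value is in [0, 1] and the columns are not forced to sum to 1.
    probs = np.column_stack(
        [clf.predict_proba(X_test)[:, 1] for clf in binary_clfs]
    )
    print(probs)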
    
