
Training a logistic regression model with features of different dimensions in scikit-learn


Using Python 2.7 on Windows. I want to fit a logistic regression model for a classification problem, using T1 and T2 as features and T3 as the target.

I show the values of T1 and T2 below, along with my code. The problem is: since each sample's T1 feature has dimension 4 while its T2 feature has dimension 1, how should we preprocess them so that scikit-learn's logistic regression training can use both correctly?

BTW, what I mean is: for training sample 1, the T1 features are [0, -1, -2, -3] and the T2 feature is [0]; for training sample 2, the T1 features are [1, 0, -1, -2] and the T2 feature is [1]; and so on.

import numpy as np
from sklearn import linear_model, datasets

arc = lambda r,c: r-c
T1 = np.array([[arc(r,c) for c in xrange(4)] for r in xrange(5)])
print T1
print type(T1)
T2 = np.array([[arc(r,c) for c in xrange(1)] for r in xrange(5)])
print T2
print type(T2)
T3 = np.array([0,0,1,1,1])

logreg = linear_model.LogisticRegression(C=1e5)

# fit a logistic regression model,
# using T1 and T2 as features and T3 as target
logreg.fit(T1+T2, T3)

T1:

[[ 0 -1 -2 -3]
 [ 1  0 -1 -2]
 [ 2  1  0 -1]
 [ 3  2  1  0]
 [ 4  3  2  1]]

T2:

[[0]
 [1]
 [2]
 [3]
 [4]]
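A quick sanity check (sketched here in Python 3 syntax for portability) shows why `T1 + T2` cannot serve as the combined feature matrix: NumPy broadcasts T2's single column across T1's four columns and adds element-wise, so the result stays (5, 4) instead of becoming a (5, 5) joined matrix:

```python
import numpy as np

T1 = np.array([[r - c for c in range(4)] for r in range(5)])  # shape (5, 4)
T2 = np.array([[r - c for c in range(1)] for r in range(5)])  # shape (5, 1)

# Broadcast addition: T2's column is added to every column of T1.
summed = T1 + T2
print(summed.shape)   # (5, 4) -- same shape, values changed

# Concatenation along axis 1: T2 becomes a genuine fifth feature column.
stacked = np.concatenate((T1, T2), axis=1)
print(stacked.shape)  # (5, 5)
```

So `logreg.fit(T1+T2, T3)` silently trains on corrupted features rather than raising an error, which is why the shapes are worth checking explicitly.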

1 Answer

    You need to concatenate the feature matrices along axis 1 with numpy.concatenate.

    import numpy as np
    from sklearn import linear_model, datasets
    
    arc = lambda r,c: r-c
    T1 = np.array([[arc(r,c) for c in xrange(4)] for r in xrange(5)])
    T2 = np.array([[arc(r,c) for c in xrange(1)] for r in xrange(5)])
    T3 = np.array([0,0,1,1,1])
    
    X = np.concatenate((T1,T2), axis=1)
    Y = T3
    logreg = linear_model.LogisticRegression(C=1e5)
    
    # fit a logistic regression model,
    # using T1 and T2 as features and T3 as target
    logreg.fit(X, Y)
    
    X_test = np.array([[1, 0, -1, -1, 1],
                       [0, 1, 2, 3, 4]])
    
    print logreg.predict(X_test)
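As a usage note, `np.hstack` is an equivalent spelling of the axis-1 concatenation used in the answer (a minimal sketch, again in Python 3 syntax):

```python
import numpy as np

T1 = np.array([[r - c for c in range(4)] for r in range(5)])  # shape (5, 4)
T2 = np.array([[r - c for c in range(1)] for r in range(5)])  # shape (5, 1)

# For 2-D arrays, np.hstack stacks column-wise, i.e. concatenate with axis=1.
X = np.hstack((T1, T2))
print(X.shape)  # (5, 5)
```

Either form works; `np.hstack` just saves spelling out the axis.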
    
