
nolearn for multi-label classification

I am trying to use DBN, imported from the nolearn package. Here is my code:

from nolearn.dbn import DBN
import numpy as np
from sklearn import cross_validation

fileName = 'data.csv'
fileName_1 = 'label.csv'

data = np.genfromtxt(fileName, dtype=float, delimiter = ',')
label = np.genfromtxt(fileName_1, dtype=int, delimiter = ',')

clf = DBN(
    [data, 300, 10],
    learn_rates=0.3,
    learn_rate_decays=0.9,
    epochs=10,
    verbose=1,
    )

clf.fit(data,label)
score = cross_validation.cross_val_score(clf, data, label,scoring='f1', cv=10)
print score

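My data has shape (1231, 229) and my labels have shape (1231, 13), so the label set looks like ([0 0 1 0 1 0 1 0 0 0 1 1 0], ..., [...]). When I run my code, I get this error message: bad input shape (1231, 13). I suspect one of two problems:

  • DBN does not support multi-label classification

  • My labels are not in a form that DBN's fit function accepts.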

2 Answers

  • 5

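    As Francisco Vargas mentioned, nolearn.dbn is deprecated, and you should use nolearn.lasagne instead (if you can).

    If you want to do multi-label classification in Lasagne, set the regression parameter to True, and define a validation score and a custom loss.

    Here is an example: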

    import numpy as np
    import theano.tensor as T
    from lasagne import layers
    from lasagne.updates import nesterov_momentum
    from nolearn.lasagne import NeuralNet
    from nolearn.lasagne import BatchIterator
    from lasagne import nonlinearities
    
    # custom loss: multi label cross entropy
    def multilabel_objective(predictions, targets):
        epsilon = np.float32(1.0e-6)
        one = np.float32(1.0)
        pred = T.clip(predictions, epsilon, one - epsilon)
        return -T.sum(targets * T.log(pred) + (one - targets) * T.log(one - pred), axis=1)
    
    
    net = NeuralNet(
        # customize "layers" to represent the architecture you want
        # here I took a dummy architecture
        layers=[(layers.InputLayer, {"name": 'input', 'shape': (None, 1, 229, 1)}),
    
                (layers.DenseLayer, {"name": 'hidden1', 'num_units': 20}),
                (layers.DenseLayer, {"name": 'output', 'nonlinearity': nonlinearities.sigmoid, 'num_units': 13})], #because you have 13 outputs
    
        # optimization method:
        update=nesterov_momentum,
        update_learning_rate=5*10**(-3),
        update_momentum=0.9,
    
        max_epochs=500,  # we want to train this many epochs
        verbose=1,
    
        #Here are the important parameters for multi labels
        regression=True,  
    
        objective_loss_function=multilabel_objective,
        custom_score=("validation score", lambda x, y: np.mean(np.abs(x - y)))
    
        )
    
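    # X_train and labels_train are assumed to be defined already:
    # X_train shaped to match the input layer, labels_train of shape (n_samples, 13)
    net.fit(X_train, labels_train)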
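
To make the custom loss easier to inspect, here is a minimal NumPy sketch of the same multi-label cross entropy as the Theano function above, applied to a tiny hand-made batch. The arrays `probs` and `targets` are made-up values for illustration, and the 0.5 cut-off used to turn sigmoid outputs into 0/1 labels is an assumption, not something the answer specifies.

```python
import numpy as np


def multilabel_cross_entropy(predictions, targets, epsilon=1e-6):
    """NumPy mirror of the Theano loss above: returns one loss value per sample."""
    # Clip so that log() never sees exactly 0 or 1.
    pred = np.clip(predictions, epsilon, 1.0 - epsilon)
    return -np.sum(
        targets * np.log(pred) + (1.0 - targets) * np.log(1.0 - pred),
        axis=1,
    )


# Hypothetical sigmoid outputs for 2 samples and 3 labels.
probs = np.array([[0.9, 0.2, 0.7],
                  [0.1, 0.6, 0.4]], dtype=np.float32)
targets = np.array([[1, 0, 1],
                    [0, 1, 1]], dtype=np.float32)

per_sample_loss = multilabel_cross_entropy(probs, targets)

# Convert probabilities to 0/1 labels; the 0.5 threshold is an assumption.
predicted_labels = (probs >= 0.5).astype(int)

print(per_sample_loss)
print(predicted_labels)
```

Because each label gets its own sigmoid and its own cross-entropy term, the labels are treated independently, which is what allows several of the 13 outputs to be active for one sample.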
    
  • 0

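    fit calls BuildDBN, which can be found here. One important thing to note is that dbn has been deprecated, so you can only find it in old_commits. In any case, if you are looking for additional information: from what I can see in your code snippet, the first argument to DBN, [data, 300, 10], should be [data.shape[1], 300, 10], based on the documentation and the source code. Hope this helps.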
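
The layer-size fix suggested in this answer can be sketched with NumPy alone. The zero-filled arrays below are stand-ins with the shapes given in the question (the real data would come from the CSV files), and the sketch deliberately stops before calling DBN, since nolearn.dbn is deprecated.

```python
import numpy as np

# Stand-in arrays with the shapes reported in the question.
data = np.zeros((1231, 229), dtype=float)
label = np.zeros((1231, 13), dtype=int)

# The first entry must be the number of input features (an integer),
# not the data array itself.
n_features = data.shape[1]
layer_sizes = [n_features, 300, 10]

print(layer_sizes)   # list of plain integers
print(label.shape)   # the labels are still 2-D, one column per label
```

Note that this only fixes the first argument. The label array still has shape (1231, 13), so the shape error from the question is a separate issue.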
