
Getting the same prediction value for all inputs in a trained TensorFlow network

I created a TensorFlow network that reads data from this dataset (note: the information in this dataset is made up purely for testing purposes and is not real):

[dataset screenshot omitted]

I am trying to build a TensorFlow network that essentially predicts the value of the 'Exited' column. My network is set up to take 11 inputs, pass them through 2 hidden layers (6 neurons each) with relu activation, and produce a single binary output through a sigmoid activation function, interpreted as a probability. I am using a gradient descent optimizer and a mean squared error cost function. However, after training the network on my training data and predicting on my test data, all of my predicted values are greater than 0.5, meaning every prediction comes out as true, and I am not sure what the problem is:

import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X_train, X_test, y_train, y_test = train_test_split(X_data, y_data, test_size=0.2, random_state=101)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.fit_transform(X_test)

training_epochs = 200
n_input = 11
n_hidden_1 = 6
n_hidden_2 = 6
n_output = 1

def neuralNetwork(x, weights):
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    output_layer = tf.add(tf.matmul(layer_2, weights['output']), biases['output'])
    output_layer = tf.nn.sigmoid(output_layer)
    return output_layer

weights = {
    'h1': tf.Variable(tf.random_uniform([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_uniform([n_hidden_1, n_hidden_2])),
    'output': tf.Variable(tf.random_uniform([n_hidden_2, n_output]))
}

biases = {
    'b1': tf.Variable(tf.random_uniform([n_hidden_1])),
    'b2': tf.Variable(tf.random_uniform([n_hidden_2])),
    'output': tf.Variable(tf.random_uniform([n_output]))
}

x = tf.placeholder('float', [None, n_input]) # [?, 11]
y = tf.placeholder('float', [None, n_output]) # [?, 1]

output = neuralNetwork(x, weights)
cost = tf.reduce_mean(tf.square(output - y))
optimizer = tf.train.AdamOptimizer().minimize(cost)

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    for epoch in range(training_epochs):
        session.run(optimizer, feed_dict={x:X_train, y:y_train.reshape((-1,1))})
    print('Model has completed training.')
    test = session.run(output, feed_dict={x:X_test})
    predictions = (test>0.5).astype(int)
    print(predictions)

Any help is appreciated! I have been looking at questions related to my problem, but none of the suggestions seem to help.

1 Answer


    Initial assumption: I won't access data from a personal link, for security reasons. It is your responsibility to create reproducible snippets based only on safe, persistent artifacts.
    However, I can confirm the same behavior occurs when your code is run against keras.datasets.mnist, with just a small change: each sample is associated with the label 0: odd, 1: even.

    Short answer: you messed up the initialization. Change tf.random_uniform to tf.random_normal and set the biases deterministically to 0.

    Actual answer: ideally, you want the model to start out predicting at random, close to 0.5. This prevents the sigmoid output from saturating and keeps the gradients large in the early stages of training.
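
    A quick numpy check makes this concrete: the sigmoid derivative is s(y) * (1 - s(y)), which collapses towards 0 as soon as the pre-activation y drifts away from 0, so a saturated output produces almost no learning signal.

    import numpy as np

    def sigmoid(y):
        return 1.0 / (1.0 + np.exp(-y))

    for y in [0.0, 2.0, 5.0, 10.0]:
        s = sigmoid(y)
        grad = s * (1.0 - s)  # derivative of the sigmoid at y
        print("y=%5.1f  s(y)=%.4f  s'(y)=%.6f" % (y, s, grad))

    # y=  0.0  s(y)=0.5000  s'(y)=0.250000
    # y=  2.0  s(y)=0.8808  s'(y)=0.104994
    # y=  5.0  s(y)=0.9933  s'(y)=0.006648
    # y= 10.0  s(y)=1.0000  s'(y)=0.000045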

    The sigmoid equation is s(y) = 1/(1 + e**-y), and s(y) = 0.5 <=> y = 0. So the output of the layer, y = w * x + b, must be close to 0.

    As you are using StandardScaler, your input data follows a Gaussian distribution with mean = 0.0 and std = 1.0. Your parameters must respect this distribution! However, you have initialized your biases with tf.random_uniform, which draws values uniformly from the [0, 1) interval.

    By starting the biases at 0, y will stay close to 0:

    y = w * x + b = sum(.1 * -1, .9 * -.9, ..., .1 * 1, .9 * .9) + 0 = 0
    

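    The same argument can be checked with a standalone numpy simulation of the 11-6-6-1 architecture (illustrative only; random standardized inputs stand in for the scaled features): with uniform [0, 1) weights and biases, the final pre-activation is a sum of non-negative relu outputs times positive weights plus a positive bias, so it is always positive and every prediction lands above 0.5, which is exactly the reported symptom. With normal weights and zero biases, the sign of the logit is no longer fixed.

    import numpy as np

    rng = np.random.default_rng(0)

    def forward(x, w1, b1, w2, b2, w3, b3):
        # same 11-6-6-1 relu/relu/sigmoid architecture, in plain numpy
        h1 = np.maximum(x @ w1 + b1, 0.0)
        h2 = np.maximum(h1 @ w2 + b2, 0.0)
        z = h2 @ w3 + b3  # final pre-activation (logit)
        return 1.0 / (1.0 + np.exp(-z))

    x = rng.standard_normal((1000, 11))  # stand-in for StandardScaler output

    # original initialization: everything uniform in [0, 1)
    shapes = [(11, 6), (6,), (6, 6), (6,), (6, 1), (1,)]
    p_uniform = forward(x, *[rng.uniform(0.0, 1.0, s) for s in shapes])

    # suggested initialization: normal weights, zero biases
    p_normal = forward(x, rng.standard_normal((11, 6)), np.zeros(6),
                       rng.standard_normal((6, 6)), np.zeros(6),
                       rng.standard_normal((6, 1)), np.zeros(1))

    print('uniform init: min prediction = %.3f' % p_uniform.min())  # always > 0.5
    print('normal init:  min prediction = %.3f, max prediction = %.3f'
          % (p_normal.min(), p_normal.max()))  # logit sign no longer fixed
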
    So your biases should be:

    biases = {
        'b1': tf.Variable(tf.zeros([n_hidden_1])),
        'b2': tf.Variable(tf.zeros([n_hidden_2])),
        'output': tf.Variable(tf.zeros([n_output]))
    }
    

    This is already enough to make the model output numbers smaller than 0.5:

    [1.        0.4492423 0.4492423 ... 0.4492423 0.4492423 1.       ]
    predictions mean: 0.7023628
    confusion matrix:
    [[4370 1727]
     [1932 3971]]
    accuracy: 0.6950833333333334
    

    Further corrections:

    • Your neuralNetwork function does not take a biases parameter. It uses the one defined in the enclosing scope instead, which looks like a mistake.

    • You should not fit the scaler on the test data: you would discard the statistics learned from the train split, and it violates the principle that this chunk of data is purely observational. Do this instead:

    scaler = StandardScaler()
    x_train = scaler.fit_transform(x_train)
    x_test = scaler.transform(x_test)
    
    • Using MSE with a sigmoid output is quite unusual. Use binary cross-entropy instead (a short gradient comparison follows this snippet):
    logits = tf.add(tf.matmul(layer_2, weights['output']), biases['output'])
    output = tf.nn.sigmoid(logits)
    cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))
    
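    The practical difference shows up in the gradients. With MSE, the derivative of the loss with respect to the logit is 2 * (s(z) - y) * s(z) * (1 - s(z)), so a saturated, confidently wrong sigmoid learns almost nothing, while with binary cross-entropy it is simply s(z) - y. A small numpy check of those two formulas:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    y_true = 0.0                # true label
    for z in [1.0, 5.0, 10.0]:  # increasingly "confidently wrong" logits
        s = sigmoid(z)
        grad_mse = 2.0 * (s - y_true) * s * (1.0 - s)  # d/dz of (s(z) - y)**2
        grad_bce = s - y_true                          # d/dz of cross-entropy with logits
        print('z=%4.1f  MSE grad=%.6f  BCE grad=%.6f' % (z, grad_mse, grad_bce))
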
    • Initializing the weights from a normal distribution is more reliable (a scaled variant is sketched after this snippet):
    weights = {
        'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
        'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
        'output': tf.Variable(tf.random_normal([n_hidden_2, n_output]))
    }
    
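    Going one step further (optional, not required for the fix), scaling the standard deviation by the fan-in, in the spirit of He initialization for relu layers, keeps the hidden pre-activations at a reasonable magnitude. A possible variant:

    weights = {
        'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1],
                                           stddev=(2.0 / n_input) ** 0.5)),
        'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2],
                                           stddev=(2.0 / n_hidden_1) ** 0.5)),
        'output': tf.Variable(tf.random_normal([n_hidden_2, n_output],
                                               stddev=(2.0 / n_hidden_2) ** 0.5))
    }
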
    • You are feeding the entire train set at every epoch instead of batching it, which is what Keras does by default. It is therefore reasonable to expect a Keras implementation to converge faster, and results may differ (a minimal mini-batch loop is sketched below).
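
    A minimal mini-batch loop that could replace the single full-batch session.run call (batch_size = 32 is an assumed value, chosen to mirror the Keras default; x, y, optimizer, x_train and y_train are the same objects as above):

    import numpy as np

    batch_size = 32  # assumed hyperparameter

    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        n_samples = x_train.shape[0]
        for epoch in range(training_epochs):
            order = np.random.permutation(n_samples)  # reshuffle every epoch
            for start in range(0, n_samples, batch_size):
                idx = order[start:start + batch_size]
                session.run(optimizer, feed_dict={x: x_train[idx], y: y_train[idx]})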

    By making a few tweaks, I managed to reach this result:

    import tensorflow as tf
    from keras.datasets.mnist import load_data
    from sacred import Experiment
    from sklearn.metrics import accuracy_score, confusion_matrix
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    
    ex = Experiment('test-16')
    
    
    @ex.config
    def my_config():
        training_epochs = 200
        n_input = 784
        n_hidden_1 = 32
        n_hidden_2 = 32
        n_output = 1
    
    
    def neuralNetwork(x, weights, biases):
        layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
        layer_1 = tf.nn.relu(layer_1)
        layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
        layer_2 = tf.nn.relu(layer_2)
        logits = tf.add(tf.matmul(layer_2, weights['output']), biases['output'])
        predictions = tf.nn.sigmoid(logits)
        return logits, predictions
    
    
    @ex.automain
    def main(training_epochs, n_input, n_hidden_1, n_hidden_2, n_output):
        (x_train, y_train), _ = load_data()
        x_train = x_train.reshape(x_train.shape[0], -1).astype(float)
        y_train = (y_train % 2 == 0).reshape(-1, 1).astype(float)
    
        x_train, x_test, y_train, y_test = train_test_split(x_train, y_train, test_size=0.2, random_state=101)
        print('y samples:', y_train, y_test, sep='\n')
    
        scaler = StandardScaler()
        x_train = scaler.fit_transform(x_train)
        x_test = scaler.transform(x_test)
    
        weights = {
            'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
            'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
            'output': tf.Variable(tf.random_normal([n_hidden_2, n_output]))
        }
    
        biases = {
            'b1': tf.Variable(tf.zeros([n_hidden_1])),
            'b2': tf.Variable(tf.zeros([n_hidden_2])),
            'output': tf.Variable(tf.zeros([n_output]))
        }
    
        x = tf.placeholder('float', [None, n_input])  # [?, 784]
        y = tf.placeholder('float', [None, n_output])  # [?, 1]
    
        logits, output = neuralNetwork(x, weights, biases)
        # cost = tf.reduce_mean(tf.square(output - y))
        cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))
        optimizer = tf.train.AdamOptimizer().minimize(cost)
    
        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            try:
                for epoch in range(training_epochs):
                    print('epoch #%i' % epoch)
                    session.run(optimizer, feed_dict={x: x_train, y: y_train})
    
            except KeyboardInterrupt:
                print('interrupted')
    
            print('Model has completed training.')
            p = session.run(output, feed_dict={x: x_test})
            p_labels = (p > 0.5).astype(int)
    
            print(p.ravel())
            print('predictions mean:', p.mean())
    
            print('confusion matrix:', confusion_matrix(y_test, p_labels), sep='\n')
            print('accuracy:', accuracy_score(y_test, p_labels))
    
    [0.        1.        0.        ... 0.0302309 0.        1.       ]
    predictions mean: 0.48261687
    confusion matrix:
    [[5212  885]
     [ 994 4909]]
    accuracy: 0.8434166666666667
    
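    For comparison, a rough Keras equivalent of the 784-32-32-1 model above (a sketch only; batch_size=32 is the Keras default and the number of epochs here is an arbitrary choice):

    from keras.datasets.mnist import load_data
    from keras.layers import Dense
    from keras.models import Sequential
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    (x, y), _ = load_data()
    x = x.reshape(x.shape[0], -1).astype(float)
    y = (y % 2 == 0).astype(float)

    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=101)
    scaler = StandardScaler()
    x_train = scaler.fit_transform(x_train)
    x_test = scaler.transform(x_test)

    model = Sequential([
        Dense(32, activation='relu', input_shape=(784,)),
        Dense(32, activation='relu'),
        Dense(1, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=10, batch_size=32, validation_data=(x_test, y_test))
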
