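Introduction

I have an ordinary CNN built with TensorFlow. My goal is to train it and then use it to classify images into two classes.

About the training dataset

X: images (healthy, unhealthy), 128 * 128

Labels: [1, 0] (unhealthy) or [0, 1] (healthy)

I use TFRecords to build the dataset.

About the CNN model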

def weight_variable(shape):

    initial = tf.truncated_normal(shape, stddev = 0.1, dtype = tf.float32)
    return tf.Variable(initial)


def bias_variable(shape):

    initial = tf.constant(0.1, shape = shape, dtype = tf.float32)
    return tf.Variable(initial)


def conv2d(x, W):

    #(input, filter, strides, padding)
    #[batch, height, width, in_channels]
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):

    #(value, ksize, strides, padding)
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

def cnn_model():

    epochs = 1
    batch_size = 200
    learning_rate = 0.001
    hidden = 1024
    cap_c = 498
    cap_h = 478
    num = cap_c + cap_h # the sum number of the training x
    image_size = 128
    label_size = 2
    ex = 2

    #train_loss = np.empty((num//(batch_size * ex)) * epochs)
    #train_acc = np.empty((num//(batch_size * ex)) * epochs)

    x = tf.placeholder(tf.float32, shape = [None, image_size * image_size])
    y = tf.placeholder(tf.float32, shape = [None, label_size])

    X_train_ = tf.reshape(x, [-1, image_size, image_size, 1])

    #First layer
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])

    h_conv1 = tf.nn.relu(conv2d(X_train_, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    #Second layer
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])

    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    #Third layer
    #W_conv3 = weight_variable([5, 5, 64, 128])
    #b_conv3 = bias_variable([128])

    #h_conv3 = tf.nn.relu(conv2d(h_pool2, W_conv3) + b_conv3)
    #h_pool3 = max_pool_2x2(h_conv3)

    #Full connect layer
    W_fc1 = weight_variable([64 * 64 * 32, hidden])
    b_fc1 = bias_variable([hidden])

    h_pool2_flat = tf.reshape(h_pool2, [-1, 64 * 64 * 32])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    #Output_Softmax

    W_fc2 = weight_variable([hidden, label_size])
    b_fc2 = bias_variable([label_size])

    y_conv = tf.nn.softmax(tf.matmul(h_fc1, W_fc2) + b_fc2)

    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels = y, logits = y_conv))
    optimize = tf.train.AdamOptimizer(learning_rate).minimize(loss)
    correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y, 1)) 
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

The rest is the data-reading and session code.

About shapes

Shapes at the placeholders, with a batch size of 200:

X shape: [200, 128 * 128]

Label shape: [200, 2]

Output shape: [200, 2]
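As a sanity check on these shapes, the spatial size after each layer can be traced by hand. The sketch below assumes the `padding='SAME'` rule that the output size is ceil(input / stride); the helper names `same_padding_output` and `trace_shapes` are made up for illustration and are not part of the model code. Under those assumptions `h_pool2` comes out as 32 x 32 x 64 per image, i.e. 65536 values, which may be worth comparing with the `64 * 64 * 32` used for `W_fc1` and the reshape.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size of a conv or pool layer with padding='SAME'."""
    return math.ceil(size / stride)


def trace_shapes(image_size=128):
    """Return (name, height, width, channels) after each stage of the posted model."""
    size, channels = image_size, 1
    shapes = [("input", size, size, channels)]
    # (name, stride, out_channels); out_channels=None means pooling keeps the channels
    layers = [("conv1", 1, 32), ("pool1", 2, None),
              ("conv2", 1, 64), ("pool2", 2, None)]
    for name, stride, out_channels in layers:
        size = same_padding_output(size, stride)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, size, size, channels))
    return shapes


def flattened_size(image_size=128):
    """Number of values per image after the last pooling layer."""
    _, h, w, c = trace_shapes(image_size)[-1]
    return h * w * c
```

Calling `trace_shapes()` lists every intermediate shape, so a mismatch between the last pooling layer and the fully connected layer shows up before a session is run.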

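About the output

I expected each prediction to be trained towards [1, 0] or [0, 1], but after about 5 steps every prediction in a batch is the same: all [1, 0] or all [0, 1]. For example, with a batch size of 5 the result is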

[[1, 0],
[1, 0],
[1, 0],
[1, 0],
[1, 0]]

or exactly the opposite. Sometimes, however, the predictions do vary, like this

[[1, 0],
[0, 1],
[1, 0],
[0, 1],
[1, 0]]

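But this lasts only about 5 steps, after which the predictions are all identical again.

About loss and accuracy

Because the predictions are wrong, the loss does not converge. In other words, the loss and accuracy depend entirely on which samples of X happen to be in the training batch, which is meaningless.

My thoughts

I don't think the TFRecords dataset is the problem, because I printed the image matrices and the labels and they all look fine. So I think the problem is in the model.

I have not found anything that solves this, either through Google or in other questions on SO. I would really appreciate any help. If you need more results or code for reference, please let me know.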

1 Answer

0
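I think your data may be unbalanced, i.e. the number of training samples is not the same for the two classes. In your case, you may have more healthy samples than unhealthy ones. When that happens, the loss drops significantly just by assigning every sample to the same class, and after that the misclassified samples are unlikely to be classified correctly again for some time.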

You can try resampling the data so that the two classes have roughly equal numbers of samples.
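One way to do the resampling is to oversample the smaller class with replacement. This is only a minimal NumPy sketch: it assumes the images and one-hot labels are already loaded into arrays, and `oversample_to_balance` is a made-up helper name, not something from TensorFlow.

```python
import numpy as np


def oversample_to_balance(images, labels, seed=0):
    """Resample with replacement so every class has as many rows as the largest one.

    images: array of shape [n, ...]; labels: one-hot array of shape [n, num_classes].
    """
    rng = np.random.default_rng(seed)
    class_ids = np.argmax(labels, axis=1)
    counts = np.bincount(class_ids, minlength=labels.shape[1])
    target = counts.max()
    picked = []
    for cls in range(labels.shape[1]):
        idx = np.flatnonzero(class_ids == cls)
        if idx.size == 0:
            continue  # an absent class cannot be resampled
        # Keep every original row and draw the missing ones at random
        extra = rng.choice(idx, size=target - idx.size, replace=True)
        picked.append(np.concatenate([idx, extra]))
    order = rng.permutation(np.concatenate(picked))
    return images[order], labels[order]
```

Oversampling repeats some images of the smaller class, so it does not add new information; it only evens out how often each class contributes to the loss.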

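Another approach is to use a weighted cross-entropy: compute the cross-entropy for each sample, multiply it by a weight (more precisely, by a tensor holding one weight per sample), and only then apply tf.reduce_mean. For example, you can give a larger weight to the class with fewer samples, which forces the optimizer to pay more attention to those samples.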

It should look something like this:
```
weights = tf.placeholder(tf.float32, shape=[None])
loss = tf.reduce_mean(tf.multiply(tf.nn.softmax_cross_entropy_with_logits(labels = y, logits = y_conv), weights))
```
Of course, you need to feed values into weights at some point.
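As a rough illustration of where those values could come from, the sketch below derives one weight per sample from a batch of one-hot labels using inverse class frequency. The helper name `per_sample_weights` and the exact weighting formula are assumptions of mine; other schemes would work too.

```python
import numpy as np


def per_sample_weights(labels):
    """Inverse-frequency weight for each row of a one-hot label batch.

    The weight of class c is n / (num_classes * count_c), so a perfectly
    balanced batch gets a weight of 1.0 for every sample.
    """
    labels = np.asarray(labels, dtype=np.float32)
    n, num_classes = labels.shape
    counts = labels.sum(axis=0)
    # Guard against division by zero for classes absent from the batch
    class_weights = n / (num_classes * np.maximum(counts, 1.0))
    # Each one-hot row selects the weight of its own class
    return labels @ class_weights
```

The result has shape [batch_size], matching the `weights` placeholder, so it could be passed along with the batch in `feed_dict`, for example `{x: batch_x, y: batch_y, weights: per_sample_weights(batch_y)}`, where `batch_x` and `batch_y` stand for whatever your input pipeline yields.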
Answered on 2024-05-13T21:12:12+08:00
