Why doesn't this CNN script predict correctly?

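I'm new to both Python and machine learning, and I'm working on my first real image recognition project. It's based on this tutorial, which has only two classes (cat or dog) and a lot more data. My multi-class version does not predict correctly, and what I mainly want to know is how to troubleshoot the script.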

Examples of domino images in the training data

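The script is below. The data consists of 7 folders with about 10-15 images each. The images are 100x100px pictures of different dominoes, and one folder contains only baby photos (mainly as a control group, since they are very different from the domino photos):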

from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.models import model_from_json
import numpy
import os

# Initialising the CNN
classifier = Sequential()

# Step 1 - Convolution
classifier.add(Conv2D(32, (25, 25), input_shape = (100, 100, 3), activation = 'relu'))

# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Adding a second convolutional layer
classifier.add(Conv2D(32, (25, 25), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Step 3 - Flattening
classifier.add(Flatten())

# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 7, activation = 'sigmoid')) # 7 units equals amount of output categories

# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])


# Part 2 - Fitting the CNN to the images
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
    shear_range = 0.2,
    zoom_range = 0.2,
    horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
    target_size = (100, 100),
    batch_size = 32,
    class_mode = 'categorical')
test_set = test_datagen.flow_from_directory('dataset/test_set',
    target_size = (100, 100),
    batch_size = 32,
    class_mode = 'categorical')
classifier.fit_generator(training_set,
    steps_per_epoch = 168,
    epochs = 35,
    validation_data = test_set,
    validation_steps = 3)
classifier.summary()

# serialize weights to HDF5
classifier.save_weights("dominoweights.h5")
print("Saved model to disk")

# Part 3 - Making new predictions
import numpy as np
from keras.preprocessing import image

path = 'dataset/prediction_images/' # Folder with my images
for filename in os.listdir(path):
  if "jpg" in filename:
    test_image = image.load_img(path + filename, target_size = (100, 100))
    test_image = image.img_to_array(test_image)
    test_image = np.expand_dims(test_image, axis = 0)
    result = classifier.predict(test_image)
    print result
    training_set.class_indices
    folder = training_set.class_indices.keys()[(result[0].argmax())] # Get the index of the highest predicted value
    if folder == '1':
      prediction = '1x3'
    elif folder == '2':
      prediction = '1x8'
    elif folder == '3':
      prediction = 'Baby'
    elif folder == '4':
      prediction = '5x7'
    elif folder == '5':
      prediction = 'Upside down'
    elif folder == '6':
      prediction = '2x3'   
    elif folder == '7':
      prediction = '0x0'
    else:
      prediction = 'Unknown'
    print "Prediction: " + filename + " seems to be " + prediction
  else:
    print "DSSTORE"
  print "\n"

Explanations:

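Training data: about 10-15 images per class, 168 training images in total
Test data: 3 images per class
dataset/prediction_images/ contains about 10 different images that the script will predict
result usually outputs array([[0., 0., 1., 0., 0., 0., 0.]], dtype=float32)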

My question(s)

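My main question is: do you see anything specifically wrong with the script? Or does the script work fine, and it is just the lack of data that makes the predictions wrong?

Sub-questions:

Do I understand the convolution layer correctly, that there is a 25x25px window scanning the image? I tried the "default" 3x3px, but got the same result.
The number 32 in the convolution layer: does it refer to 32-bit images?
Is it normal to have 2 convolution layers? I can't really see why it is needed.
The whole section with: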

classifier.fit_generator(training_set,
    steps_per_epoch = 168,
    epochs = 35,
    validation_data = test_set,
    validation_steps = 3)

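confuses me. As far as I understand, steps_per_epoch should be the number of training images I have. Is that right? And is epochs the number of iterations the CNN runs?

I don't understand why this code is needed: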

from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
    shear_range = 0.2,
    zoom_range = 0.2,
    horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)

To me it looks like it creates copies/versions of the images, zooms them, flips them, and so on. Why is that needed?

Any tips on this would help me a lot!

1 Answer

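Nothing in the code looks obviously wrong, but filters of size (25,25) are probably not a good choice.

There are two possibilities:
- Training metrics are good but test metrics are bad: your model is overfitting (possibly because you have very little data)
- Training metrics are bad: your model is not good enough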
Subquestions:

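1 - Yes, you are using filters with a window size of (25,25), and these filters slide along the input image. The bigger the filters, the less general they can be.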
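To make the effect of the filter size concrete, here is a minimal sketch of the shape arithmetic for the two convolution and pooling blocks in the question's script. It assumes the Keras defaults that the script relies on (no padding and stride 1 for Conv2D, and a pooling stride equal to the pool size); the helper names are made up for illustration.

```python
def conv_out(side, kernel):
    """Side length after an unpadded ('valid') convolution with stride 1."""
    return side - kernel + 1


def pool_out(side, pool=2):
    """Side length after non-overlapping max pooling."""
    return side // pool


def feature_map_side(input_side, kernel, blocks=2):
    """Side length after `blocks` rounds of convolution followed by pooling."""
    side = input_side
    for _ in range(blocks):
        side = pool_out(conv_out(side, kernel))
    return side


# Two conv + pool blocks on a 100x100 image, as in the question.
side_with_25 = feature_map_side(100, 25)  # 100 -> 76 -> 38 -> 14 -> 7
side_with_3 = feature_map_side(100, 3)    # 100 -> 98 -> 49 -> 47 -> 23

print("25x25 filters leave a %dx%d map" % (side_with_25, side_with_25))
print("3x3 filters leave a %dx%d map" % (side_with_3, side_with_3))
```

Under those assumptions, the 25x25 filters shrink each 100x100 image to a 7x7 map before Flatten, while 3x3 filters would leave 23x23.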

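2 - The number 32 is how many output "channels" you want from this layer. Your input image has 3 channels (a red layer, a green layer, and a blue layer), and these convolutional layers will produce 32 different channels. What each channel means depends on hidden math that we cannot see.
- The number of channels is completely independent of anything else.
- The only restrictions are: the input has 3 channels and the output has 7 classes.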
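As a rough illustration of how the channel counts determine the size of a layer, the number of trainable weights in a convolution layer follows from the filter size, the input channels, and the output channels. The function name below is hypothetical; the formula is the usual one for a layer with one bias per filter.

```python
def conv2d_param_count(kernel_h, kernel_w, in_channels, filters):
    """Trainable weights in a 2D convolution layer with one bias per filter."""
    weights_per_filter = kernel_h * kernel_w * in_channels
    return (weights_per_filter + 1) * filters


# First layer of the question's script: 3 input channels, 32 output channels.
first_25 = conv2d_param_count(25, 25, 3, 32)
first_3 = conv2d_param_count(3, 3, 3, 32)

# Second layer: its input is the 32 channels produced by the first layer.
second_25 = conv2d_param_count(25, 25, 32, 32)
second_3 = conv2d_param_count(3, 3, 32, 32)

print(first_25, second_25)  # parameter counts with 25x25 filters
print(first_3, second_3)    # parameter counts with 3x3 filters
```

With only 168 training images, the much larger parameter count of the 25x25 layers gives the model far more freedom to memorize the training set.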
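3 - It is normal to have "a lot of" convolutional layers stacked one on top of another. Some well-known models have more than 10 convolutional layers.
- Why is it needed? Each convolutional layer interprets the results of the previous layer and produces new results. This gives the model more power. One layer may be too few.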
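4 - The generators produce batches with shape (batch_size, image_side1, image_side2, channels).
- steps_per_epoch is necessary because the generators used are infinite (so Keras doesn't know when to stop)
- It is common to use steps_per_epoch = total_images//batch_size, so that one epoch uses all the images. But you can play with these numbers as you wish
- Usually, one epoch is one pass over the entire dataset. (With generators and steps_per_epoch, though, that is up to the user)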
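The steps_per_epoch arithmetic can be sketched with the numbers from the question (168 training images, batch_size 32). The helper name and the keep_partial_batch flag are made up for this example; whether the final, smaller batch is counted is a choice left to the user.

```python
import math


def steps_for_one_pass(total_images, batch_size, keep_partial_batch=True):
    """Number of batches needed to see every image about once."""
    if keep_partial_batch:
        # Count the final, smaller batch as a step.
        return math.ceil(total_images / batch_size)
    # Drop the leftover images that do not fill a whole batch.
    return total_images // batch_size


total_images = 168
batch_size = 32

floor_steps = steps_for_one_pass(total_images, batch_size, keep_partial_batch=False)
ceil_steps = steps_for_one_pass(total_images, batch_size)

print("steps_per_epoch = %d (floor) or %d (ceil)" % (floor_steps, ceil_steps))
```

Either value is far below the 168 used in the question: steps_per_epoch counts batches, not images, so 168 steps makes each "epoch" run through the small dataset many times.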
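5 - Besides loading data from your folders and creating the classes for you, the image data generator is also a tool for data augmentation.
- If you have too little data, your model will overfit (excellent training results, terrible test results).
- Machine learning needs a lot of data to work well
- Data augmentation is a way of creating more data when you don't have enough
- From the model's point of view, images that are shifted, flipped, stretched, and so on are entirely new images
- For instance, a model can learn cats looking to the right and still fail on cats looking to the left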
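To illustrate why a flipped image counts as new data, here is a toy sketch of a horizontal flip. It uses a small nested list as a stand-in for an image array and is not how ImageDataGenerator is implemented; the function name is made up for this example.

```python
def horizontal_flip(image):
    """Mirror an image left to right; `image` is a list of pixel rows."""
    return [list(reversed(row)) for row in image]


# A tiny 3x3 "image" with a shape that leans to the left.
image = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
]

flipped = horizontal_flip(image)

for row in flipped:
    print(row)
```

The flipped array has different pixel values in most positions, so the model sees it as a different training example even though a person would recognize the same object.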
Answered 2024-05-06T18:27:13+08:00
