
How do I load MNIST via TensorFlow (including the download)?

The TensorFlow documentation for MNIST recommends several different ways of loading the MNIST dataset.

All of the ways described in the documentation throw many deprecation warnings with TensorFlow 1.8.

This is how I currently load MNIST and create the training batches:

class MNIST:
    def __init__(self, optimizer):
        ...
        self.mnist_dataset = input_data.read_data_sets("/tmp/data/", one_hot=True)
        self.test_data = self.mnist_dataset.test.images.reshape((-1, self.timesteps, self.num_input))
        self.test_label = self.mnist_dataset.test.labels
        ...

    def train_run(self, sess):
        batch_input, batch_output = self.mnist_dataset.train.next_batch(self.batch_size, shuffle=True)
        batch_input = batch_input.reshape((self.batch_size, self.timesteps, self.num_input))
        _, loss = sess.run(fetches=[self.train_step, self.loss], feed_dict={self.input_placeholder: batch_input, self.output_placeholder: batch_output})
        ...

    def test_run(self, sess):
        loss = sess.run(fetches=[self.loss], feed_dict={self.input_placeholder: self.test_data, self.output_placeholder: self.test_label})
        ...

How can I do exactly the same thing using the current, non-deprecated approach?

I could not find any documentation on this.

It seems to me that the new way is something along these lines:

train, test = tf.keras.datasets.mnist.load_data()
self.mnist_train_ds = tf.data.Dataset.from_tensor_slices(train)
self.mnist_test_ds = tf.data.Dataset.from_tensor_slices(test)

But how can I use these datasets in my train_run and test_run methods?

1 Answer

  • 1

    An example of loading the MNIST dataset using the TF dataset API:


    Create MNIST datasets for the training and validation images:

    You can create a dataset from numpy inputs using either Dataset.from_tensor_slices or Dataset.from_generator. Dataset.from_tensor_slices adds the whole dataset to the computational graph, so we will use Dataset.from_generator instead.

    import tensorflow as tf

    # Load MNIST as NumPy arrays (downloaded on first use)
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

    def create_mnist_dataset(data, labels, batch_size):
        def gen():
            for image, label in zip(data, labels):
                yield image, label
        # Each element is a float32 image of shape (28, 28) and a scalar int32 label
        ds = tf.data.Dataset.from_generator(gen, (tf.float32, tf.int32), ((28, 28), ()))
        # Repeat indefinitely and group consecutive elements into batches
        return ds.repeat().batch(batch_size)

    # Train and validation datasets with different batch sizes
    train_dataset = create_mnist_dataset(x_train, y_train, 10)
    valid_dataset = create_mnist_dataset(x_test, y_test, 20)
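
To make the batching order concrete, here is a rough plain-Python analogue of what the generator followed by `repeat().batch(batch_size)` hands out. It is only a sketch of the ordering, not TensorFlow code: `repeat_then_batch` is a hypothetical name, and it returns Python lists where TensorFlow would return stacked tensors.

```python
import itertools


def repeat_then_batch(data, labels, batch_size):
    """Plain-Python sketch of the order produced by ds.repeat().batch(batch_size)."""
    def gen():
        for image, label in zip(data, labels):
            yield image, label

    # repeat(): start gen() again every time it is exhausted, forever.
    stream = itertools.chain.from_iterable(gen() for _ in itertools.count())

    # batch(): take consecutive elements, ignoring where one pass over the data ends.
    while True:
        chunk = list(itertools.islice(stream, batch_size))
        images, batch_labels = zip(*chunk)
        yield list(images), list(batch_labels)


batches = repeat_then_batch(["a", "b", "c", "d", "e"], [0, 1, 2, 3, 4], batch_size=3)
print(next(batches)[1])  # [0, 1, 2]
print(next(batches)[1])  # [3, 4, 0]
```

Because the repeat comes before the batch, every batch is full, and a batch can mix the end of one pass over the data with the start of the next.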
    

    A feedable iterator that can toggle between training and validation

    handle = tf.placeholder(tf.string, shape=[])
    iterator = tf.data.Iterator.from_string_handle(
        handle, train_dataset.output_types, train_dataset.output_shapes)
    image, label = iterator.get_next()
    
    train_iterator = train_dataset.make_one_shot_iterator()
    valid_iterator = valid_dataset.make_one_shot_iterator()
    

    A sample run:

    # A toy network: flatten each image and regress a single value onto the label
    y = tf.layers.dense(tf.layers.flatten(image), 1, activation=tf.nn.relu)
    loss = tf.losses.mean_squared_error(labels=label, predictions=tf.squeeze(y, axis=1))
    
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        # The `Iterator.string_handle()` method returns a tensor that can be evaluated
        # and used to feed the `handle` placeholder.
        train_handle = sess.run(train_iterator.string_handle())
        valid_handle = sess.run(valid_iterator.string_handle())

        # Run training
        train_loss, train_img, train_label = sess.run([loss, image, label],
                                                      feed_dict={handle: train_handle})
        # train_img.shape == (10, 28, 28); the flatten happens inside the network

        # Run validation
        valid_pred, valid_img = sess.run([y, image],
                                         feed_dict={handle: valid_handle})
        # valid_img.shape == (20, 28, 28)
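
If you would rather keep the `feed_dict` style of the `train_run` and `test_run` methods in the question, one possible alternative (not part of the approach above) is to batch the NumPy arrays returned by `tf.keras.datasets.mnist.load_data()` yourself. The sketch below is only an illustration: `ShuffledBatcher` and `one_hot` are hypothetical helper names, and dropping the leftover examples at the end of each pass is a simplifying assumption.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Turn integer labels of shape (N,) into one-hot rows of shape (N, num_classes)."""
    return np.eye(num_classes, dtype=np.float32)[labels]


class ShuffledBatcher:
    """Hypothetical stand-in for mnist_dataset.train.next_batch(batch_size, shuffle=True)."""

    def __init__(self, images, labels, seed=None):
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self.images = images
        self.labels = labels
        self._rng = np.random.RandomState(seed)
        self._order = self._rng.permutation(len(images))
        self._pos = 0

    def next_batch(self, batch_size):
        if batch_size > len(self.images):
            raise ValueError("batch_size exceeds the number of examples")
        # Not enough unseen examples left: drop the remainder and reshuffle.
        if self._pos + batch_size > len(self._order):
            self._order = self._rng.permutation(len(self.images))
            self._pos = 0
        idx = self._order[self._pos:self._pos + batch_size]
        self._pos += batch_size
        return self.images[idx], self.labels[idx]
```

In `__init__` you could then build `self.train_batcher = ShuffledBatcher(x_train.astype(np.float32) / 255.0, one_hot(y_train))`, and in `train_run` call `self.train_batcher.next_batch(self.batch_size)` before feeding the placeholders as before. The division by 255 mirrors the [0, 1] scaling that `input_data.read_data_sets` applied by default. The test arrays can be converted the same way and fed in `test_run` in one go.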
    
