Why must the Keras LSTM batch size used for prediction be the same as the batch size used for fitting?

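When using a Keras LSTM to predict on time series data, I keep getting errors when I train the model with a batch size of 50 and then try to predict with the same model using a batch size of 1 (i.e. just predicting the next value).

Why am I not able to train and fit the model with multiple batches at once, and then use that model to predict with anything other than the same batch size? It doesn't seem to make sense, but then I could easily be missing something about this.

Edit: this is the model. batch_size is 50 and sl is the sequence length, which is currently set to 20.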

model = Sequential()
model.add(LSTM(1, batch_input_shape=(batch_size, 1, sl), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=epochs, batch_size=batch_size, verbose=2)

This is the line that predicts on the training set for RMSE:

# make predictions
trainPredict = model.predict(trainX, batch_size=batch_size)

This is the actual prediction of unseen time steps:

for i in range(test_len):
    print('Prediction %s: ' % str(pred_count))

    next_pred_res = np.reshape(next_pred, (next_pred.shape[1], 1, next_pred.shape[0]))
    # make predictions
    forecastPredict = model.predict(next_pred_res, batch_size=1)
    forecastPredictInv = scaler.inverse_transform(forecastPredict)
    forecasts.append(forecastPredictInv)
    next_pred = next_pred[1:]
    next_pred = np.concatenate([next_pred, forecastPredict])

    pred_count += 1

The problem is with this line:

forecastPredict = model.predict(next_pred_res, batch_size=batch_size)

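The error when batch_size here is set to 1 is:

ValueError: Cannot feed value of shape (1, 1, 2) for Tensor 'lstm_1_input:0', which has shape '(10, 1, 2)'

This is the same error that is thrown when batch_size here is set to 50, and with other batch sizes as well.

The full traceback is: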

forecastPredict = model.predict(next_pred_res, batch_size=1)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/models.py", line 899, in predict
    return self.model.predict(x, batch_size=batch_size, verbose=verbose)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/engine/training.py", line 1573, in predict
    batch_size=batch_size, verbose=verbose)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/engine/training.py", line 1203, in _predict_loop
    batch_outs = f(ins_batch)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2103, in __call__
    feed_dict=feed_dict)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 944, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 1, 2) for Tensor 'lstm_1_input:0', which has shape '(10, 1, 2)'

Edit: once I set the model to stateful=False, I am able to use different batch sizes for fitting/training and prediction. What is the reason for this?

6 Answers

14

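Unfortunately, what you want to do is impossible with Keras ... I have also spent a lot of time on this problem, and the only way is to dive down the rabbit hole and work with Tensorflow directly to do LSTM rolling prediction.

First, to be clear on terminology, batch_size usually means the number of sequences that are trained together, and num_steps means how many time steps are trained together. When you say batch_size=1 and "just predicting the next value", I think you mean predicting with num_steps=1.

Otherwise, it should be possible to train and predict with batch_size=50, meaning you are training on 50 sequences and making 50 predictions at every time step, one for each sequence (meaning training/prediction with num_steps=1).

However, I think what you mean is that you want to use a stateful LSTM to train with num_steps=50 and predict with num_steps=1. Theoretically this makes sense and should be possible, and it is possible with Tensorflow, just not with Keras.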

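The problem: Keras requires an explicit batch size for stateful RNNs. You must specify batch_input_shape (batch_size, num_steps, features).

The reason: Keras must allocate a fixed-size hidden state vector in the computation graph, with shape (batch_size, num_units), in order to persist the values between training batches. On the other hand, when stateful=False, the hidden state vector can be initialized dynamically with zeros at the beginning of each batch, so it does not need to be a fixed size. More details: http://philipperemy.github.io/keras-stateful-lstm/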
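To make the fixed-size state argument concrete, here is a minimal NumPy sketch. It is not Keras code and not a real LSTM: `TinyStatefulRNN` is a hypothetical toy tanh RNN, and the sizes (50 sequences, 2 features, 4 units) are made up for illustration. The explicit shape check stands in for the fixed-shape input that Keras builds for a stateful layer.

```python
import numpy as np

class TinyStatefulRNN:
    """Toy stateful recurrent layer (plain tanh RNN, not a real LSTM)."""

    def __init__(self, batch_size, num_features, num_units, seed=0):
        rng = np.random.default_rng(seed)
        # Weights depend only on feature and unit counts, never on batch size.
        self.w_in = rng.normal(size=(num_features, num_units))
        self.w_rec = rng.normal(size=(num_units, num_units))
        # The persistent state has one row per sequence in the batch,
        # so its shape is fixed when the layer is created.
        self.state = np.zeros((batch_size, num_units))

    def step(self, x):
        # Mimic a fixed-shape input: reject batches of any other size.
        if x.shape[0] != self.state.shape[0]:
            raise ValueError(
                "batch of %d rows does not match state of %d rows"
                % (x.shape[0], self.state.shape[0]))
        self.state = np.tanh(x @ self.w_in + self.state @ self.w_rec)
        return self.state

layer = TinyStatefulRNN(batch_size=50, num_features=2, num_units=4)
out = layer.step(np.ones((50, 2)))   # fine: batch matches the stored state

try:
    layer.step(np.ones((1, 2)))      # fails: the state still has 50 rows
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)
```

With stateful=False the equivalent of `self.state` would simply be recreated with zeros for whatever batch arrives, which is why the restriction disappears in that mode.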

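Possible workaround: train and predict with num_steps=1. Example: https://github.com/keras-team/keras/blob/master/examples/lstm_stateful.py . This may or may not work at all for your problem, since the gradient for back propagation will be computed over only one time step. See: https://github.com/fchollet/keras/issues/3669

My solution, use Tensorflow: in Tensorflow you can train with batch_size=50, num_steps=100 and then predict with batch_size=1, num_steps=1. This is possible by creating separate model graphs for training and prediction that share the same RNN weight matrices. See this example for next-character prediction: https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/model.py#L11 and the blog post http://karpathy.github.io/2015/05/21/rnn-effectiveness/ . Note that one graph can still only work with one specified batch_size, but you can set up multiple model graphs sharing weights in Tensorflow.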
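The idea of two graphs sharing one set of weights can be sketched without Tensorflow. The following is a toy NumPy illustration under stated assumptions: `make_weights` and `run_stateful` are hypothetical helpers, the recurrence is a plain tanh RNN rather than an LSTM, and the sizes are arbitrary. It only shows that the same weight arrays can drive a 50-row state and a 1-row state and give the same result for the same sequence.

```python
import numpy as np

def make_weights(num_features, num_units, seed=0):
    """Create weight arrays whose shapes do not involve the batch size."""
    rng = np.random.default_rng(seed)
    return {
        "w_in": rng.normal(size=(num_features, num_units)),
        "w_rec": rng.normal(size=(num_units, num_units)),
    }

def run_stateful(weights, steps, batch_size):
    """Feed time steps one by one, carrying a state sized for batch_size."""
    num_units = weights["w_rec"].shape[0]
    state = np.zeros((batch_size, num_units))
    for x in steps:  # each x has shape (batch_size, num_features)
        state = np.tanh(x @ weights["w_in"] + state @ weights["w_rec"])
    return state

weights = make_weights(num_features=2, num_units=4)
rng = np.random.default_rng(1)
# Five time steps for 50 parallel sequences.
steps = [rng.normal(size=(50, 2)) for _ in range(5)]

# "Training-shaped" run: all 50 sequences at once.
train_view = run_stateful(weights, steps, batch_size=50)
# "Prediction-shaped" run: only the first sequence, batch size 1.
predict_view = run_stateful(weights, [x[:1] for x in steps], batch_size=1)

# Both runs share the weights, so the first sequence ends in the same state.
same = np.allclose(train_view[:1], predict_view)
```

Only the state buffer changes shape between the two runs; the weights are untouched, which is what makes sharing them across graphs possible.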

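Answered 2024-04-20T18:36:01+08:00
1
Sadly, what you want is impossible because you specify the batch_size when you define the model ... However, I found a simple way around this problem: create 2 models! The first is used for training and the second for prediction, and they share weights: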
```
train_model = Sequential([Input(batch_input_shape=(batch_size,...),
<continue specifying your model>])

predict_model = Sequential([Input(batch_input_shape=(1,...),
<continue specifying exact same model>])

train_model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam())
predict_model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam())
```
Now you can use any batch size you want. After you fit your train_model, just save its weights and load them with predict_model:
```
train_model.save_weights('lstm_model.h5')
predict_model.load_weights('lstm_model.h5')
```
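Notice that you only want to save and load the weights, not the whole model (which includes the architecture, optimizer, etc.). This way you get the weights but can input one batch at a time ... More on saving/loading Keras models: https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model

Note that you need to have h5py installed to use "save weights".
Answered 2024-04-20T18:36:01+08:00

Another simple workaround is: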

def create_model(batch_size):
    model = Sequential()
    model.add(LSTM(1, batch_input_shape=(batch_size, 1, sl), stateful=True))
    model.add(Dense(1))
    return model

model_train = create_model(batch_size=50)

model_train.compile(loss='mean_squared_error', optimizer='adam')
model_train.fit(trainX, trainY, epochs=epochs, batch_size=batch_size)

model_predict = create_model(batch_size=1)

weights = model_train.get_weights()
model_predict.set_weights(weights)

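Answered 2024-04-20T18:36:01+08:00

0
The best solution to this problem is "Copy Weights". It can be really helpful if you want to train and predict with your LSTM model using different batch sizes.

For example, once you have trained your model with a batch size of 'n' as shown below: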
```
# configure network
n_batch = len(X)
n_epoch = 1000
n_neurons = 10
# design network
model = Sequential()
model.add(LSTM(n_neurons, batch_input_shape=(n_batch, X.shape[1], X.shape[2]), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
```
And now you want to predict on fewer values than your batch size, say n=1.

What you can do is copy the weights of the fitted model and reinitialize a new LSTM model with the same architecture, setting the batch size to 1.
```
# re-define the batch size
n_batch = 1
# re-define model
new_model = Sequential()
new_model.add(LSTM(n_neurons, batch_input_shape=(n_batch, X.shape[1], X.shape[2]), stateful=True))
new_model.add(Dense(1))
# copy weights
old_weights = model.get_weights()
new_model.set_weights(old_weights)
```
Now you can easily predict and train LSTMs with different batch sizes.
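Copying works because none of the arrays returned by get_weights() has a batch dimension. The sketch below writes out the shapes involved, assuming the standard Keras LSTM layout with a bias (input kernel, recurrent kernel, bias, with the four gates stacked along the last axis). `lstm_weight_shapes` and `lstm_state_shapes` are hypothetical helpers for illustration, and `n_features = 1` is an assumed value for X.shape[2].

```python
def lstm_weight_shapes(num_features, num_units):
    """Shapes of the arrays a Keras LSTM layer exposes via get_weights()."""
    return [
        (num_features, 4 * num_units),  # input kernel, four gates side by side
        (num_units, 4 * num_units),     # recurrent kernel
        (4 * num_units,),               # bias
    ]

def lstm_state_shapes(batch_size, num_units):
    """Shapes of the hidden state and cell state a stateful LSTM keeps."""
    return [(batch_size, num_units), (batch_size, num_units)]

def count_params(shapes):
    """Total number of scalars across a list of array shapes."""
    total = 0
    for shape in shapes:
        size = 1
        for dim in shape:
            size *= dim
        total += size
    return total

n_neurons = 10
n_features = 1  # assumed value of X.shape[2], for illustration only

# The weight shapes take no batch size argument at all.
weight_shapes = lstm_weight_shapes(n_features, n_neurons)
n_params = count_params(weight_shapes)

# Only the state buffers differ between the two models.
train_states = lstm_state_shapes(batch_size=50, num_units=n_neurons)
predict_states = lstm_state_shapes(batch_size=1, num_units=n_neurons)
```

Since set_weights() only needs the weight shapes to match, the model rebuilt with a batch size of 1 accepts the trained weights unchanged, while its state buffers are allocated fresh at the new size.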

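For more information, please read: https://machinelearningmastery.com/use-different-batch-sizes-training-predicting-python-keras/
Answered 2024-04-20T18:36:01+08:00

I found the following helpful (and fully in line with the above). The section "Solution 3: Copy Weights" worked for me: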

How to use Different Batch Sizes when Training and Predicting with LSTMs, by Jason Brownlee

n_neurons = 10
# design network
model = Sequential()
model.add(LSTM(n_neurons, batch_input_shape=(n_batch, X.shape[1], X.shape[2]), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
# fit network
for i in range(n_epoch):
    model.fit(X, y, epochs=1, batch_size=n_batch, verbose=1, shuffle=False)
    model.reset_states()
# re-define the batch size
n_batch = 1
# re-define model
new_model = Sequential()
new_model.add(LSTM(n_neurons, batch_input_shape=(n_batch, X.shape[1], X.shape[2]), stateful=True))
new_model.add(Dense(1))
# copy weights
old_weights = model.get_weights()
new_model.set_weights(old_weights)
# compile model
new_model.compile(loss='mean_squared_error', optimizer='adam')

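Answered 2024-04-20T18:36:01+08:00

3
I had the same problem and solved it.

In other words, you can save your weights and, when testing the results, rebuild the model with the same architecture, set batch_size=1, and reload the weights, as below: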
```
n_neurons = 10
# design network with the same architecture, but a batch size of 1
model = Sequential()
model.add(LSTM(n_neurons, batch_input_shape=(1, X.shape[1], X.shape[2]), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
# load the weights saved from the trained model
model.load_weights("w.h5")
```
It will work well. I hope it is helpful for you.
Answered 2024-04-20T18:36:01+08:00
