A practical project on predicting the next value from time series data with a TensorFlow LSTM

Recently I have been working on predicting the next day's web views. I chose an RNN-LSTM model, but the results are not good.

My raw data covers about 200 days, with 1440 points per day (one value per minute, 1440 minutes per day), giving a shape of [200 * 1440, 1].

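After feature engineering, I expanded the single feature (just the web view value) to 8 features: the index within the day (0 to 1439), the web view value, is_weekday (0 or 1), the hour of the day (0 to 23), the day of the week (0 to 6), and so on. After this step, the data has shape [200 * 1440, 8].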
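A minimal sketch of the feature expansion described above, using only the five features the post names (the remaining three are not specified, so they are left out). The function name `build_features` and the `start_weekday` parameter are hypothetical; the post does not say on which weekday the data starts, so Monday is assumed by default.

```python
MINUTES_PER_DAY = 1440


def build_features(values, start_weekday=0):
    """Return one feature row per minute for a flat list of per-minute values.

    start_weekday is the weekday of the first day (0 = Monday ... 6 = Sunday).
    This is an assumption: the post does not give the start date.
    """
    rows = []
    for i, value in enumerate(values):
        day, minute_of_day = divmod(i, MINUTES_PER_DAY)
        day_of_week = (start_weekday + day) % 7
        rows.append([
            minute_of_day,                # index within the day, 0..1439
            value,                        # web view value
            1 if day_of_week < 5 else 0,  # is_weekday
            minute_of_day // 60,          # hour of the day, 0..23
            day_of_week,                  # day of the week, 0..6
        ])
    return rows
```

For 200 days of data this would give 200 * 1440 rows with 5 columns each; adding the three unnamed features would bring it to the [200 * 1440, 8] shape mentioned above.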

Then I normalize each feature so that it lies in [-1, 1].
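The post does not say which normalization is used; one common way to land in [-1, 1] is min-max scaling, sketched here per feature column. The helper names are hypothetical. Fitting the minimum and maximum on the training portion only, then reusing them for validation and test data, avoids leaking information from the days being predicted.

```python
def fit_min_max(column):
    """Return the (min, max) of one feature column, e.g. from the training split."""
    return min(column), max(column)


def scale_to_unit_range(column, lo, hi):
    """Linearly map values from [lo, hi] to [-1, 1]."""
    if hi == lo:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in column]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]
```

Values outside the fitted range (for example a new traffic peak in the test period) will fall outside [-1, 1] with this scheme.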

Then I use the LSTM model. The architecture is FNN -> 3 x LSTM -> 2 x FNN, with rnn_hidden_unit = 128. I use 30 time_steps to predict the next value, for example t0, t1, ..., t29 -> t30 and t1, t2, ..., t30 -> t31.
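The windowing scheme (t0..t29 -> t30, t1..t30 -> t31) can be sketched as follows. `make_windows` is a hypothetical helper; `rows` is assumed to be the list of per-minute feature rows and `targets` the per-minute web view values in the same order.

```python
def make_windows(rows, targets, time_steps=30):
    """Build (input window, next value) pairs with a stride of one step.

    Window k covers rows[k : k + time_steps] and is paired with
    targets[k + time_steps], the value right after the window.
    """
    inputs, labels = [], []
    for start in range(len(rows) - time_steps):
        inputs.append(rows[start:start + time_steps])
        labels.append(targets[start + time_steps])
    return inputs, labels
```

Each input then has shape [time_step, n_features], which batches into [batch_size, time_step, n_features].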

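The first FNN layer maps the 8 features to 128 so that the input fits the LSTM layers: [batch_size, time_step, n_features] -> [batch_size, time_step, rnn_hidden_unit]
The LSTM part has 3 layers (MultiRNNCell, lstm_depth = 3).
The last 2 FNN layers map 128 to 64 and then to 1.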

last_cell_output = output_rnn[:, -1, :]
output_layer1 = tf.contrib.layers.fully_connected(last_cell_output, num_outputs=64, activation_fn=None)
pred = tf.contrib.layers.fully_connected(output_layer1, num_outputs=1, activation_fn=None)

Loss function: only the output of the last LSTM cell is used to compute the loss.

loss = tf.reduce_mean(tf.square(tf.reshape(pred, [-1]) - tf.reshape(Y, [-1]))) + regularization_cost
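To make the loss above concrete, here is the same computation in plain Python. The post does not show how `regularization_cost` is built; this sketch assumes it is the L2 rate times the sum of squared weights, which is one common choice. The function name and the flat `weights` list are hypothetical.

```python
def mse_with_l2(preds, targets, weights, l2_rate=0.001):
    """Mean squared error plus an L2 penalty on the weights.

    Assumes regularization_cost = l2_rate * sum(w ** 2); the post does not
    state the exact form.
    """
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)
    regularization_cost = l2_rate * sum(w * w for w in weights)
    return mse + regularization_cost
```

With inputs normalized to [-1, 1], the MSE term is measured in normalized units, so it is worth checking that the penalty term is not dominating it.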

In the result plot, the blue line is the actual value, and the gray band is bounded by two values: predict_value * 0.8 and predict_value * 1.2.

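However, when I use a traditional statistical method instead, namely a weighted average of the previous 5 weekdays if the day to predict is a weekday, or of the previous 2 weekends if it is a weekend day, the result is good.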
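A rough sketch of that statistical baseline, assuming the average is taken minute by minute over past days of the same type. The post gives no weights, so linearly increasing weights (most recent day heaviest) are an assumption here, and "previous 2 weekends" is read as the previous 2 weekend days. `baseline_forecast` and its parameters are hypothetical names.

```python
def baseline_forecast(days, weekdays, target_weekday, weights=None):
    """Forecast one day as a weighted average of recent days of the same type.

    days: list of per-day value lists, oldest first.
    weekdays: day of the week for each entry in days (0 = Monday ... 6 = Sunday).
    target_weekday: day of the week of the day to predict.
    weights: optional weights, oldest first; defaults to 1, 2, ..., n (assumed).
    """
    want_weekday = target_weekday < 5
    n_history = 5 if want_weekday else 2
    # Keep only days of the same type (weekday or weekend), oldest first.
    same_type = [d for d, w in zip(days, weekdays) if (w < 5) == want_weekday]
    history = same_type[-n_history:]
    if not history:
        raise ValueError("no past days of the required type")
    if weights is None:
        weights = list(range(1, len(history) + 1))
    if len(weights) != len(history):
        raise ValueError("weights must match the number of history days")
    total = float(sum(weights))
    n_points = len(history[0])
    return [
        sum(w * day[i] for w, day in zip(weights, history)) / total
        for i in range(n_points)
    ]
```

Running the LSTM and this baseline on the same held-out days, with the same error metric, gives a direct comparison between the two.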

So are there any methods or tricks to improve the LSTM model's results? Or does my approach itself have some flaws? Thanks a lot!

Some important hyperparameters of the LSTM model:

time_step=30 (I also tried 60 and 120; 30 and 60 work well)
rnn_hidden_unit=128
n_features=8
lstm_depth=3
ouput_keep_prob=0.7
l2_regularization_rate=0.001
batch_size=64
