How can I plot different summary metrics on the same chart in TensorBoard?

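I would like to plot the training loss per batch and the average validation loss over the whole validation set on the same chart in TensorBoard. I ran into this problem when my validation set was too large to fit in memory, so it has to be processed in batches using tf.metrics update ops.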

This question probably applies to any TensorFlow metrics that you want displayed on the same chart in TensorBoard.
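The batched validation average described above can be pictured with a small pure-Python sketch. It is only an illustration of the accumulate-then-read pattern behind a streaming mean, not TensorFlow code: the class name `StreamingMean` and the per-batch loss values are made up for this example.

```python
class StreamingMean:
    """Hypothetical stand-in for the (value, update op) pair of a streaming mean."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Plays the role of re-initializing the metric's local variables
        # before each validation pass.
        self.total = 0.0
        self.count = 0

    def update(self, batch_loss):
        # Plays the role of running the update op on one validation batch.
        self.total += batch_loss
        self.count += 1
        return self.result()

    def result(self):
        # Plays the role of reading the metric value; guards against
        # division by zero before any update has happened.
        return self.total / self.count if self.count else 0.0


metric = StreamingMean()
for batch_loss in [0.75, 0.25]:  # made-up per-batch validation losses
    metric.update(batch_loss)
average_validation_loss = metric.result()
print(average_validation_loss)  # 0.5
```

Only the final value read after the last batch is the number that should appear as a single point on the chart; the intermediate values returned by `update` are partial averages.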

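I am able to:

  • plot the two graphs separately (see here)

  • plot the per-batch validation loss on the same graph as the per-batch training loss (fine when the validation set fits in a single batch, since I can reuse the training summary op train_summ below)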

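In the example code below, my problem stems from the fact that my validation summary, tf.summary.scalar with name=loss, gets renamed to loss_1 and is therefore moved to a separate graph in TensorBoard. From what I can work out, TensorBoard takes summaries with the same name and plots them on the same chart, regardless of which folder they are in. This is frustrating because train_summ (name=loss) is only ever written to the train folder and valid_summ (name=loss) is only ever written to the valid folder, yet it is still renamed to loss_1.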
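The renaming described above can be modelled with a few lines of plain Python. This is a simplified sketch that assumes names are made unique once per graph by appending a numeric suffix. `NameRegistry` is a hypothetical class written for this illustration and is not part of TensorFlow.

```python
class NameRegistry:
    """Hypothetical, simplified model of per-graph unique naming (not TensorFlow code)."""

    def __init__(self):
        self._uses = {}

    def unique_name(self, name):
        # The first request keeps the name; later requests get _1, _2, ...
        n = self._uses.get(name, 0)
        self._uses[name] = n + 1
        return name if n == 0 else "{}_{}".format(name, n)


graph_names = NameRegistry()
train_tag = graph_names.unique_name("loss")
valid_tag = graph_names.unique_name("loss")
print(train_tag, valid_tag)  # loss loss_1
```

In this model the registry belongs to the graph, not to a writer or a log directory, which is consistent with the second summary being renamed even though the two summaries are written to different folders.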

Example code:

# View graphs with (Linux): $ tensorboard --logdir=/tmp/my_tf_model

import tensorflow as tf
import numpy as np
import os
import tempfile

def train_data_gen():
    yield np.random.normal(size=[3]), np.array([0.5, 0.5, 0.5])

def valid_data_gen():
    yield np.random.normal(size=[3]), np.array([0.8, 0.8, 0.8])

batch_size = 25
n_training_batches = 4
n_valid_batches = 2
n_epochs = 5
summary_loc = os.path.join(tempfile.gettempdir(), 'my_tf_model')
print("Summaries written to " + summary_loc)

# Dummy data
train_data = tf.data.Dataset.from_generator(train_data_gen, (tf.float32, tf.float32)).repeat().batch(batch_size)
valid_data = tf.data.Dataset.from_generator(valid_data_gen, (tf.float32, tf.float32)).repeat().batch(batch_size)
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(
    handle, train_data.output_types, train_data.output_shapes)
batch_x, batch_y = iterator.get_next()
train_iter = train_data.make_initializable_iterator()
valid_iter = valid_data.make_initializable_iterator()

# Some ops on the data
loss = tf.losses.mean_squared_error(batch_x, batch_y)
valid_loss, valid_loss_update = tf.metrics.mean(loss)

# Write to summaries
train_summ = tf.summary.scalar('loss', loss)
valid_summ = tf.summary.scalar('loss', valid_loss)  # <- will be renamed to "loss_1"

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_handle, valid_handle = sess.run([train_iter.string_handle(), valid_iter.string_handle()])
    sess.run([train_iter.initializer, valid_iter.initializer])

    # Summary writers
    writer_train = tf.summary.FileWriter(os.path.join(summary_loc, 'train'), sess.graph)
    writer_valid = tf.summary.FileWriter(os.path.join(summary_loc, 'valid'), sess.graph)

    global_step = 0  # implicit as no actual training
    for i in range(n_epochs):
        # "Training"
        for j in range(n_training_batches):
            global_step += 1
            summ = sess.run(train_summ, feed_dict={handle: train_handle})
            writer_train.add_summary(summary=summ, global_step=global_step)
        # "Validation"
        sess.run(tf.local_variables_initializer())
        for j in range(n_valid_batches):
            _, batch_summ = sess.run([valid_loss_update, train_summ], feed_dict={handle: valid_handle})
            # The following will plot the batch loss for the validation set on the loss plot with the training data:
            # writer_valid.add_summary(summary=batch_summ, global_step=global_step + j + 1)
        summ = sess.run(valid_summ)
        writer_valid.add_summary(summary=summ, global_step=global_step)  # <- I want this on the training loss graph

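What I have tried

  • Separate tf.summary.FileWriter objects (one for training, one for validation), as suggested in this issue and this question (I think what I am after is alluded to in a comment on that question)

  • Using tf.summary.merge to combine all my training and validation/test metrics into overall summary ops; useful bookkeeping, but it does not plot what I want on the same chart

  • Using the family argument of tf.summary.scalar (loss still gets renamed to loss_1)

  • (Complete hack solution) Using valid_loss, valid_loss_update = tf.metrics.mean(loss) on the training data and then running tf.local_variables_initializer() on every training batch. This does give you the same summary op and therefore puts things on the same chart, but surely that is not how this is meant to be done? It also does not generalize to other metrics.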

Context

  • TensorFlow 1.9.0

  • TensorBoard 1.9.0

  • Python 3.5.2

1 Answer

  • 1

    The TensorBoard custom_scalar plugin is the way to solve this problem.

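    Here is the same example, using custom_scalar to plot the two losses (the per-batch training loss and the validation loss averaged over all validation batches) on the same chart: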

    # View graphs with (Linux): $ tensorboard --logdir=/tmp/my_tf_model
    
    import os
    import tempfile
    import tensorflow as tf
    import numpy as np
    from tensorboard import summary as summary_lib
    from tensorboard.plugins.custom_scalar import layout_pb2
    
    def train_data_gen():
        yield np.random.normal(size=[3]), np.array([0.5, 0.5, 0.5])
    
    def valid_data_gen():
        yield np.random.normal(size=[3]), np.array([0.8, 0.8, 0.8])
    
    batch_size = 25
    n_training_batches = 4
    n_valid_batches = 2
    n_epochs = 5
    summary_loc = os.path.join(tempfile.gettempdir(), 'my_tf_model')
    print("Summaries written to " + summary_loc)
    
    # Dummy data
    train_data = tf.data.Dataset.from_generator(
        train_data_gen, (tf.float32, tf.float32)).repeat().batch(batch_size)
    valid_data = tf.data.Dataset.from_generator(
        valid_data_gen, (tf.float32, tf.float32)).repeat().batch(batch_size)
    handle = tf.placeholder(tf.string, shape=[])
    iterator = tf.data.Iterator.from_string_handle(handle, train_data.output_types,
                                                   train_data.output_shapes)
    batch_x, batch_y = iterator.get_next()
    train_iter = train_data.make_initializable_iterator()
    valid_iter = valid_data.make_initializable_iterator()
    
    # Some ops on the data
    loss = tf.losses.mean_squared_error(batch_x, batch_y)
    valid_loss, valid_loss_update = tf.metrics.mean(loss)
    
    with tf.name_scope('loss'):
        train_summ = summary_lib.scalar('training', loss)
        valid_summ = summary_lib.scalar('valid', valid_loss)
    
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        train_handle, valid_handle = sess.run([train_iter.string_handle(), valid_iter.string_handle()])
        sess.run([train_iter.initializer, valid_iter.initializer])
    
        writer_train = tf.summary.FileWriter(os.path.join(summary_loc, 'train'), sess.graph)
        writer_valid = tf.summary.FileWriter(os.path.join(summary_loc, 'valid'), sess.graph)
    
        layout_summary = summary_lib.custom_scalar_pb(
            layout_pb2.Layout(category=[
                layout_pb2.Category(
                    title='losses',
                    chart=[
                        layout_pb2.Chart(
                            title='losses',
                            multiline=layout_pb2.MultilineChartContent(tag=[
                                'loss/training', 'loss/valid'
                            ]))
                    ])
            ]))
        writer_train.add_summary(layout_summary)
    
        global_step = 0
        for i in range(n_epochs):
            for j in range(n_training_batches): # "Training"
                global_step += 1
                summ = sess.run(train_summ, feed_dict={handle: train_handle})
                writer_train.add_summary(summary=summ, global_step=global_step)
    
            sess.run(tf.local_variables_initializer())
            for j in range(n_valid_batches):  # "Validation"
                _, batch_summ = sess.run([valid_loss_update, train_summ], feed_dict={handle: valid_handle})
            summ = sess.run(valid_summ)
            writer_valid.add_summary(summary=summ, global_step=global_step)
    

    Here's the resulting output in TensorBoard.
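To read the combined chart, it helps to know at which steps each curve gets a point. The sketch below replays only the step bookkeeping of the loops above in plain Python, with no TensorFlow involved. The list names `train_steps` and `valid_steps` are introduced here for illustration.

```python
n_training_batches = 4
n_epochs = 5

train_steps = []  # one training point per batch
valid_steps = []  # one validation point per epoch

global_step = 0
for epoch in range(n_epochs):
    for _ in range(n_training_batches):
        global_step += 1
        train_steps.append(global_step)
    # The validation average is written once per epoch, at the step of the
    # last training batch of that epoch.
    valid_steps.append(global_step)

print(train_steps[0], train_steps[-1])  # 1 20
print(valid_steps)                      # [4, 8, 12, 16, 20]
```

With these settings the training curve has 20 points and the validation curve has 5, so the validation line looks coarser even though both share the same step axis.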
