I am running a distributed TensorFlow script. When I create the cluster servers, the console shows the following messages:

E0805 20:51:03.294260965 3387 ev_epoll1_linux.c:1051] grpc epoll fd:3
2017-08-05 20:51:03.299766: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job ps -> {0 -> localhost:2222}
2017-08-05 20:51:03.299790: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job worker -> {0 -> localhost:2223}
2017-08-05 20:51:03.305220: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:316] Started server with target: grpc://localhost:2223

When I start training, I get the same message again and nothing else happens.

E0805 20:52:45.889979901 3387 ev_epoll1_linux.c:1051] grpc epoll fd:3

The message is printed when execution reaches the line with tf.Session("grpc://localhost:2223") as sess:

TensorFlow version: 1.3.0-rc0, built with bazel. It works when run on a single machine without the cluster.

Linux version:
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty

The active Internet connections are:

Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:2222            0.0.0.0:*               LISTEN      8321/python
tcp        0      0 0.0.0.0:2223            0.0.0.0:*               LISTEN      8883/python
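
The listing above shows both ports in the LISTEN state. A quick way to cross-check the same thing from Python is sketched below. It is only a minimal sketch using the standard library: the helper name port_is_open is hypothetical, and a successful TCP connection only shows that some process accepts connections on the port, not that the gRPC service behind it responds.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection completes the TCP handshake, so success means
        # some process is accepting connections on that port.
        conn = socket.create_connection((host, port), timeout=timeout)
    except (OSError, socket.error):
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Ports taken from the listing above: 2222 (ps) and 2223 (worker).
    for port in (2222, 2223):
        print("%d open: %s" % (port, port_is_open("localhost", port)))
```

If both checks report True, the servers are at least reachable at the TCP level, which matches what the connection listing shows.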

Here is the sample code that creates the cluster servers:

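import tensorflow as tf

# Assumed from the log output above: one ps task and one worker task.
cluster = tf.train.ClusterSpec({"ps": ["localhost:2222"], "worker": ["localhost:2223"]})
# Flag definitions are assumed; the snippet only shows their use.
tf.app.flags.DEFINE_string("job_name", "", "Either 'ps' or 'worker'")
tf.app.flags.DEFINE_integer("task_index", 0, "Index of the task within its job")
FLAGS = tf.app.flags.FLAGS

def main(_):
    # Start this task's server and block so it keeps serving requests.
    server = tf.train.Server(cluster, job_name=FLAGS.job_name, task_index=FLAGS.task_index)
    server.join()

if __name__ == "__main__":
    tf.app.run()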

And here is the training code:

from __future__ import print_function

import numpy as np
import tensorflow as tf

# Synthetic data for the line y = 0.1 * x + 0.3.
train_X = np.random.rand(100).astype(np.float32)
train_Y = train_X * 0.1 + 0.3

# Pin the whole graph to the single worker task.
with tf.device("/job:worker/task:0"):
    X = tf.placeholder(tf.float32)
    Y = tf.placeholder(tf.float32)
    w = tf.Variable(0.0)
    b = tf.Variable(0.0)
    y = w * X + b
    loss = tf.reduce_mean(tf.square(y - Y))

    init_op = tf.global_variables_initializer()
    train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.Session("grpc://localhost:2223") as sess:
    sess.run(init_op)
    for i in range(500):
        # Feed the inputs as X and the targets as Y.
        sess.run(train_op, feed_dict={X: train_X, Y: train_Y})
        print("after sess.run train")
        if i % 50 == 0:
            print(i, sess.run(w), sess.run(b))

    # Read the final values while the session is still open.
    print(sess.run(w))
    print(sess.run(b))

Does anyone know how to fix this? Thanks.