I want to compute gradients using the TensorFlow C++ API. I got tf.gradients working in the Python API:
import tensorflow as tf

with tf.Session() as sess:
    a = tf.Variable([[5.0, 1.0, 2.0]], name='a')
    b = tf.Variable([[9.0, 2.0, 0.0]], name='b')
    c = tf.matmul(a, tf.transpose(b), name="c")
    deriv = tf.gradients(c, a, name="deriv")  # \partial c / \partial a
    sess.run(tf.global_variables_initializer())
    print(sess.run(deriv))

    #### get the tensor by name ####
    tmp = tf.get_default_graph().get_tensor_by_name("deriv/c_grad/MatMul:0")
    print(sess.run(tmp))

    #### save graph ####
    tf.train.write_graph(sess.graph_def, './', 'graph.pb', as_text=False)
    tf.train.write_graph(sess.graph_def, './', 'graph.pbtxt', as_text=True)
    saver = tf.train.Saver()
    saver.save(sess, "variable", global_step=0)
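Both print(sess.run(deriv)) and print(sess.run(tmp)) give the correct answer, and graph.pb is generated. I then tried the following C++ code to load the graph and compute the gradient in a way similar to what I did in Python.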
#include <cstdlib>
#include <iostream>
#include <utility>
#include <vector>

#include "tensorflow/core/public/session.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/cc/framework/ops.h"

// Print the error and abort if a TensorFlow call failed.
void checkStatus(const tensorflow::Status& status) {
  if (!status.ok()) {
    std::cout << status.ToString() << std::endl;
    exit(1);
  }
}

int main(int argc, char** argv) {
  namespace tf = tensorflow;

  tf::Session* session;
  tf::Status status = tf::NewSession(tf::SessionOptions(), &session);
  checkStatus(status);

  // Load the graph written by the Python script and attach it to the session.
  tf::GraphDef graph_def;
  status = ReadBinaryProto(tf::Env::Default(), "graph.pb", &graph_def);
  checkStatus(status);
  status = session->Create(graph_def);
  checkStatus(status);

  // Values to feed for "a" and "b".
  tf::Input::Initializer xi({1.0,2.0,-6.0});
  tf::Input::Initializer yi({9.0,2.0,0.0});
  std::vector<std::pair<tf::string, tf::Tensor>> input_tensors = {{"a", xi.tensor}, {"b", yi.tensor}};
  std::vector<tf::Tensor> output_tensors;

  // Fetch the gradient node by name.
  status = session->Run(input_tensors, {"deriv/c_grad/MatMul"}, {}, &output_tensors);
  checkStatus(status);

  tf::Tensor output = output_tensors[0];
  auto out = output.vec<float>();
  std::cout << out(0) << " " << out(1) << " " << out(2) << std::endl;

  session->Close();
  return 0;
}
It compiles without problems, but I get a runtime error:
Internal: Output 0 of type double does not match declared output type float for node _recv_b_0 = _Recv[client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=-3399984051910545345, tensor_name="b", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()
Besides "deriv/c_grad/MatMul", I also tried every other name related to "deriv" in graph.pbtxt, but none of them worked.
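So my question is: how do I refer to the node that computes the gradient? Python seems to accept "deriv/c_grad/MatMul", but C++ does not. I would also like to know whether there is a more general way to do this, because in the future I may differentiate a neural network output, rather than a dot product, w.r.t. the input tensor, and then MatMul will no longer be the last step.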
1 Answer
It looks like you are not initializing the yi input tensor correctly.
In your example, you initialize yi with double literals:
tf::Input::Initializer yi({9.0,2.0,0.0});
You should probably try this instead (note the added f):
tf::Input::Initializer yi({9.0f,2.0f,0.0f});
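The mismatch comes down to C++ literal types: 9.0 is a double, while 9.0f is a float. Below is a minimal sketch of that deduction which compiles without TensorFlow. DTypeName and InferDType are hypothetical stand-ins introduced only for illustration; they are not TensorFlow functions and only mimic how an initializer might pick a dtype from the element type of a braced list.

```cpp
#include <initializer_list>
#include <string>
#include <type_traits>

// Hypothetical helper (not part of TensorFlow): maps a C++ element type
// to the name of the corresponding TensorFlow dtype.
template <typename T>
std::string DTypeName() {
  if (std::is_same<T, float>::value) return "DT_FLOAT";
  if (std::is_same<T, double>::value) return "DT_DOUBLE";
  return "other";
}

// Hypothetical stand-in for an initializer that deduces its dtype from
// the literals in a braced list, e.g. InferDType({9.0f, 2.0f, 0.0f}).
template <typename T>
std::string InferDType(std::initializer_list<T>) {
  return DTypeName<T>();
}
```

With these stand-ins, InferDType({9.0, 2.0, 0.0}) yields "DT_DOUBLE", whereas InferDType({9.0f, 2.0f, 0.0f}) yields "DT_FLOAT", the type the error message reports as declared for b. The xi initializer in the question uses double literals too, so it presumably needs the same suffix.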