Java 学习之路

0 votes

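In distributed TensorFlow, how do I write summaries from workers?

I am using the Google Cloud ML distributed sample to train a model on a cluster of machines. Inputs and outputs (i.e. TFRecords, checkpoints, tfevents) are all on gs:// (Google Storage). Similar to the distributed sample, I use an evaluation step that is called at the end, and the result is written as a summary so that it can be used for hyperparameter tuning in Cloud ML or with my own tool stack. However, rather than performing a single evaluation on a large batch of data, I run several evaluation steps in order to retrieve statistics on the performance criteria...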

tensorflow google-cloud-ml
2 votes

Results are worse on gcloud than in local training

I am trying to understand why my local results are better than my gcloud results. Locally, I run the job like this: gcloud ml-engine local train --module-name trainer.task --package-path trainer -- --vocabulary-file trainer/data/vocab.txt --class-files $CLASS_FILES --job-di...

tensorflow keras gcloud google-cloud-ml
0 votes

complex_model_l_gpu should have 8 GPUs but does not

I submitted a Keras multi_gpu_model job with gpu = 8 using the following config.yaml: trainingInput: scaleTier: CUSTOM masterType: complex_model_l_gpu workerType: standard_gpu parameterServerType: standard_gpu workerCount: 0 paramete...

google-cloud-ml
0 votes

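Unable to get results from the Google Cloud Machine Learning REST API

I am trying to use the Google Cloud Machine Learning REST API ml.jobs.project.create. The latest job I submitted shows the job ID 'drivermonitoring20180109335'. On completion of the job, the message is 'job completed successfully', but I cannot see any of the desired output files at the specified location...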

machine-learning google-cloud-platform google-cloud-ml
1 votes

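Error: Unable to match files for checkpoint gs://obj-detection/train/model.ckpt

I am running my detection model on Google Cloud ML and get this error when running the evaluation script. I found this link, which mentions the issue, but it looks like it has not been resolved yet. Does anyone know how to fix this? Any help would be appreciated. Thanks. ERROR 2018-02-04 12:53:10 -0600 master-replica-0 Unable to match files for checkpoint gs://obj-detection/train/model.c...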

tensorflow object-detection tensorboard google-cloud-ml
-1 votes

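Can anyone help me identify the "error" in my Google Cloud ML training job?

I followed the link below to replicate the process with new data and a new model: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_pets.md Before reaching the final step, I used the script below to start the training job: gcloud ml-engine jobs submit training `whoami`_object_d...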

tensorflow google-cloud-platform object-detection google-cloud-ml
0 votes

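Does TensorFlow code wrapped in the custom Estimator API use the GPU efficiently on Google Cloud ML Engine or on a local machine?

I am training a neural network on Google Cloud ML Engine. I built the network with TensorFlow high-level APIs such as tf.layers, tf.losses, and tf.dataset. The code is also wrapped in the custom Estimator API. The job has been running for a long time. The network is so large that it should use a lot of GPU, but the job details page in ML Engine shows that it uses neither the master CPU nor the GPU. Although the master CPU and GPU...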

python tensorflow google-cloud-platform google-cloud-ml tensorflow-estimator
0 votes

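Default type of distributed training on Cloud ML Engine for a custom tf.estimator

This article suggests that there are three options for distributed training: data-parallel training with synchronous updates, data-parallel training with asynchronous updates, and model-parallel training. The tutorial then goes on to suggest that the code that follows performs data-parallel training with asynchronous updates on Cloud ML Engine, which behaves as "If you distribute 10,000 batches among 10 worker nodes, each node works o...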

tensorflow google-cloud-platform google-cloud-ml google-cloud-ml-engine
1 votes

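How to package vocabulary files for Cloud ML Engine

I have a .txt file in which each line contains a different label. I use this file to create a label index lookup, for example: label_index = tf.contrib.lookup.index_table_from_file(vocabulary_file = 'labels.txt') I would like to know how I should package the vocabulary file with my Cloud ML Engine job. The packaging suggestions on how to set up .py files...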

python tensorflow google-cloud-platform google-cloud-ml google-cloud-ml-engine
2 votes

Submitting a training job to Google Cloud ML

I have the code below, which I want to submit to Google Cloud ML. I have already tested their examples and got results. from __future__ import absolute_import from __future__ import division from __future__ import print_function import tensorflow as tf import numpy a...

google-cloud-platform google-cloud-ml
4 votes

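Predicting with a locally retrained Inception model using the Google Cloud Machine Learning service

I retrained an Inception model locally using the retrain.py file from the Google Code Lab TensorFlow for Poets and would like to use the Google Cloud Machine Learning service to make predictions. Specifically, I want to modify the retrain.py file so that my TensorFlow application is ready for gcloud beta ml predict --instances=INSTANCES --model=...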

tensorflow google-cloud-ml
0 votes

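Problem with Google Cloud Machine Learning Engine when starting a training job

I want to train an object detector using Google Cloud Machine Learning Engine, and I am stuck at the training part of the tutorial. I get this error: gapic-google-cloud-logging-v2 0.91.3 has requirement google-gax<0.16dev,>=0.15.7, but you'll have google-gax 0.12.5 which is incompatible. Even after running p...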

google-cloud-platform google-cloud-ml
1 votes

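How can I periodically train and deploy new machine learning models on Google Cloud ML? How can I automate the process?

I have successfully trained a model on GCP MLE, but now, as new data is generated, the model needs to be updated every two weeks. I do this manually; can someone help me automate the process? In my current architecture, the data is stored as CSV files in a GCS bucket. I manually run a script that trains the model and exports the newly trained model as a web service (via Cloud Endpoints) so that users can query it with new data and get inferences... I want to build a system where I provide new CSV files every 2 weeks, and...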

tensorflow google-cloud-platform google-cloud-functions google-cloud-ml
0 votes

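Eval runs only at the start and at the end of training on CloudML

With the code below, eval runs only twice (at the start and at the end) when training on CloudML. I would like the evaluation to run at least every 10 seconds. If I run the same code locally, it behaves as expected. eval_spec = tf.estimator.EvalSpec(input_fn = read_dataset('{}/test*'.format(OUTPUT_GCS), mode = tf.estimator.M...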

tensorflow google-cloud-ml
0 votes

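Unable to make predictions on Google Cloud ML, while the same model works on the local machine

I am trying to train a machine learning model in Google Cloud using the TensorFlow library. I was able to train the model in the cloud after creating a bucket. The problem arises when I want to make predictions with the existing model. The code and data are available in the following GitHub repository: https://github.com/terminator172/game-price-predictions The TensorFlow version on the cloud is 1....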

python tensorflow machine-learning google-cloud-ml
0 votes

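Calling Google ML Engine predictions from Firebase Cloud Functions

I tried to call a prediction request on one of my deployed models using the HTTP API method from Postman and got this as the response: {"error": {"code": 401, "message": "Request is missing required authentication credential. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/de...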

firebase google-cloud-platform google-cloud-functions google-cloud-ml google-iam
6 votes

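Estimator predict runs in an infinite loop

I do not understand how to make a single prediction using the TensorFlow Estimator API: my code results in an infinite loop that keeps predicting the same input. According to the documentation, prediction should stop when input_fn raises a StopIteration exception: input_fn: Input function returning features which is a dictionary of string feature name to Tensor or SparseTensor. If it returns a tuple, first item is extracted as features. Prediction continues...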

python tensorflow google-cloud-ml
0 votes

AttributeError: 'module' object has no attribute 'LookupTensor'

I am trying to run a training job in Google Cloud using TensorFlow. I tried to run the training with the following command: gcloud ml-engine jobs submit training training_1 \ --job-dir=gs...

tensorflow google-cloud-platform google-cloud-ml
1 votes

gcloud ml-engine: FAILED_PRECONDITION: Field: package_uris Error

I am new to gcloud and I am trying to submit an ML job to gcloud by following this tutorial. I got an error when submitting the job. Here is the full log. sam@sam-VirtualBox:~/models/research$ gcloud ml-engine jobs submit training whoami_object_detection_date%s --job-dir=gs://tf_testi...

gcloud google-cloud-ml
0 votes

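Error when running the TensorFlow tutorial

I followed this tutorial for training on Google Cloud ML Engine. I followed it step by step, but I got an error when I submitted the ML job to the cloud. I ran this command. sam@sam-VirtualBox:~/models/research$ gcloud ml-engine jobs submit training whoami_object_detection_date%s --job-dir=...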

tensorflow gcloud google-cloud-ml
1 votes

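Google Cloud Platform, ML Engine, "No module named absl"

I am trying to train an object detector with TensorFlow by following this tutorial: https://cloud.google.com/blog/products/gcp/training-an-object-detector-using-cloud-machine-learning-engine The tutorial calls for object_detection.train, but that has been moved to legacy, so I used object_detec...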

python tensorflow google-cloud-platform object-detection google-cloud-ml
0 votes

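TensorFlow wide linear model inference is slow on GPU

I am training a sparse logistic regression model in TensorFlow. This question is specifically about the inference part. I am trying to benchmark CPU against GPU. I am using Nvidia P100 GPUs (4 dies) on my current GCE box. I am new to GPUs, so sorry for the naive questions. The model is fairly large, ~54k operations (is that considered large compared with DNN or ImageNet models?). When I log device placement, I only see gpu:0 being used, and the rest are unused? During training I do not...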

tensorflow gpu tensorflow-serving google-cloud-ml tensorrt
0 votes

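Unable to scale a wide and deep model to train on Google Cloud ML

I am trying to build a wide and deep TensorFlow model and train it on Google Cloud. I have been able to do this and train smaller development versions. However, I am now trying to scale up to more data and more training steps, and my online training job keeps failing. It runs for about 5 minutes and then I get the following error: The replica worker 2 exited with a non-zero status of 1. Termination reason...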

tensorflow google-cloud-ml google-cloud-ml-engine
1 votes

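No logs and no output from a Google Cloud ML training job

I am trying to run a training job on Google's Cloud ML. The signs that my job is working are: these messages, which indicate that the package was built and installed: INFO 2017-06-07 15:14:01 -0700 master-replica-0 Successfully built training-job-foo INFO 2017-06-07 15:14:01 -0700 master-replica-0 Installing collected packages: training-job-foo INFO 201...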

machine-learning tensorflow google-cloud-ml google-cloud-ml-engine
1 votes

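TensorFlow checkpoints are not saved correctly when using gcloud compute units instead of local compute units

When I train locally using a Google Cloud bucket as data source and destination: gcloud ml-engine local train --module-name trainer.task_v2s --package-path trainer/ I get normal results, and the checkpoints are saved correctly at 20 steps, because my dataset has 400 examples and I use a batch size of 20: 400/20 = 20 steps = 1 epoch. These files...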

python-3.x tensorflow gcloud google-cloud-ml tensorflow-estimator
0 votes

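Checkpoint file not found when restoring the evaluation graph

I have a model that runs for 4000 steps in distributed mode. Accuracy is computed every 120 seconds (as is done in the provided examples). However, sometimes the last checkpoint file cannot be found. Error: Unable to match files for checkpoint gs://path-on-gcs/train/model.ckpt-1485 The checkpoint file is present at that location. A local run of 2000 steps works perfectly. last_checkpoint = tf.train.latest_checkp...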

tensorflow google-cloud-ml
2 votes

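Google Object Detection API - training with the faster_rcnn_resnet101_coco model

I used the mobilenet model to train on my images. It worked fine. To improve accuracy, I tried to replicate the same steps with the faster_rcnn_resnet101_coco model. All the steps I used were the same. When I start the training session, it runs for about 800 steps. The training loss at that point is around 0.5, which seems too good. It stops at this step and throws the following error: The replica worker 1 exited with a non-zero status of 1. Termination reason: Error. Traceback (most recent call last...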

python tensorflow google-cloud-platform google-cloud-ml
1 votes

Running Google Dataflow locally for image recognition

I am currently doing transfer learning with TensorFlow and Google Cloud Platform. https://cloud.google.com/blog/big-data/2016/12/how-to-train-and-classify-images-using-google-cloud-machine-learning-and-cloud-dataflow When I use their sample code, on my...

machine-learning tensorflow google-cloud-platform image-recognition google-cloud-ml
1 votes

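Google Cloud ML Engine TensorFlow: performing preprocessing/tokenization in input_fn()

I want to perform basic preprocessing and tokenization in the input function. My data is contained in CSVs in a Google Cloud Storage bucket location (gs://), which I cannot modify. In addition, I perform any modifications to the input text inside the ml-engine package so that the behavior can be replicated at serving time. My input function follows this basic structure: filename_queue = tf.train.string_input_producer(filenames) reader = tf.Text...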

python tensorflow google-cloud-platform google-cloud-ml google-cloud-ml-engine
