TensorFlow not using the GPU (according to TensorBoard)

Edit: GTX 1070, Ubuntu 16.04, git hash: 3b75eb34ea2c4982fb80843be089f02d430faade

I am retraining the Inception model on my own data. Everything works fine until the last command:

bazel-bin/inception/flowers_train \
  --config=cuda \
  --train_dir="${TRAIN_DIR}" \
  --data_dir="${OUTPUT_DIRECTORY}" \
  --pretrained_model_checkpoint_path="${MODEL_PATH}" \
  --fine_tune=True \
  --initial_learning_rate=0.001 \
  --input_queue_memory_factor=1

According to the logs, TensorFlow seems to be using the GPU:

I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.7715
pciBusID 0000:03:00.0
Total memory: 7.92GiB
Free memory: 7.77GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:03:00.0)

But when I check the training in TensorBoard, the net is using mainly the CPU (blue is /device:CPU:0, green is /device:GPU:0):

[Figure: TensorBoard graph]

I tried these two TensorFlow setups:

Installed from source with the nvidia-367 driver, CUDA 8.0, cuDNN v5, and the source from master (16/10/06 - r11?). Compiled for GPU use:

bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu    
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

Docker, using the TensorFlow GPU image on a PC with a GTX 1070 8 GB:

nvidia-docker run -it -p 8888:8888 -p 6006:6006 gcr.io/tensorflow/tensorflow:latest-gpu /bin/bash

Any help?

1 Answer

According to this issue, the first 'tower' is where most of the work is done, so this looks fine.

Except that something still looks odd. Running watch nvidia-smi gives:

Mon Oct 10 10:31:04 2016

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 0000:03:00.0      On |                  N/A |
| 29%   57C    P2    41W / 230W |   7806MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1082    G   /usr/lib/xorg/Xorg                              69MiB |
|    0      3082    C   /usr/bin/python                               7729MiB |
+-----------------------------------------------------------------------------+

top gives:

  PID UTIL.     PR  NI    VIRT    RES    SHR S  %CPU %MEM    TEMPS+ COM.
 3082 root      20   0 26,739g 3,469g 1,657g S 101,3 59,7   7254:50 python

The GPU seems to be ignored...
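The nvidia-smi snapshot above shows 7806MiB of memory in use but 0% GPU-Util. By default TensorFlow reserves most of the GPU memory at startup, so the memory figure alone says little about whether the GPU is doing any work; the utilization column is the more telling one, ideally sampled several times rather than once. A minimal sketch of pulling both numbers out of a status row follows. The names parse_status_row and looks_idle and the 5% threshold are hypothetical choices for illustration, not part of nvidia-smi or TensorFlow, and the regular expression assumes the row layout shown above.

```python
import re

# Status row copied from the nvidia-smi output above.
SAMPLE_ROW = "| 29%   57C    P2    41W / 230W |   7806MiB /  8113MiB |      0%      Default |"

# Matches "<used>MiB / <total>MiB | <util>%" in an nvidia-smi status row.
# Assumption: the row uses the table layout printed by driver 367.48 above.
ROW_PATTERN = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB\s*\|\s*(\d+)%")


def parse_status_row(row):
    """Return (memory_used_mib, memory_total_mib, gpu_util_percent), or None."""
    match = ROW_PATTERN.search(row)
    if match is None:
        return None
    used, total, util = (int(group) for group in match.groups())
    return used, total, util


def looks_idle(row, util_threshold=5):
    """Hypothetical helper: True when memory is allocated but utilization is low."""
    parsed = parse_status_row(row)
    if parsed is None:
        return False
    used, _total, util = parsed
    return used > 0 and util < util_threshold


if __name__ == "__main__":
    print(parse_status_row(SAMPLE_ROW))
    print(looks_idle(SAMPLE_ROW))
```

A single idle sample does not prove the GPU is ignored, since the process may have been waiting on the CPU-side input pipeline at that instant; running the check on repeated samples gives a fairer picture.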

Answered 2024-04-25T06:05:23+08:00
