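The steps I have taken are listed below. During the ./configure step, unless I accept the system default for the cuDNN version (v5) instead of specifying v5.1.5 (as I would like to), I get an error message saying that the cuDNN version in the environment (v5) does not match.
More importantly, after digging through https://github.com/tensorflow/tensorflow/blob/master/third_party/gpus/cuda_configure.bzl around line 240, I am not sure how it would ever install version 5.1.5. Perhaps I am misreading it?
In any case, does anyone have a recipe for installing cuDNN 5.1.5 with CUDA 8.0 and Tensorflow 0.12 on a g2.2xlarge instance?
Thanks!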
Steps
(Note: these steps work, but they install cuDNN 5.0 rather than the desired 5.1.5)
Provision
- Follow the provisioning steps at https://medium.com/@giltamari/tensorflow-getting-started-gpu-installation-on-ec2-9b9915d95d6f#.2hv67eeek up to, but not including: sudo apt-get update && sudo apt-get -y upgrade
Install dependencies and tools
- Familiarize yourself with:
  - http://expressionflow.com/2016/10/09/installing-tensorflow-on-an-aws-ec2-p2-gpu-instance/
  - http://ramhiser.com/2016/01/05/installing-tensorflow-on-an-aws-ec2-instance-with-gpu-support/
- sudo apt-get update
- sudo apt-get upgrade
- sudo apt-get install -y build-essential git python-pip libfreetype6-dev libxft-dev libncurses-dev libopenblas-dev gfortran python-matplotlib libblas-dev liblapack-dev libatlas-base-dev python-dev python-pydot linux-headers-generic linux-image-extra-virtual unzip python-numpy swig python-pandas python-sklearn unzip wget pkg-config zip g++ zlib1g-dev libcurl3-dev
- sudo pip install -U pip
Install CUDA 8
- wget https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64-deb
- sudo dpkg -i cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64-deb
- rm cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64-deb
- sudo apt-get update
- sudo apt-get install -y cuda
Install cuDNN
- We want to download and install the latest version of cuDNN. Downloading cuDNN requires logging in to the NVIDIA developer site, so we cannot use wget to fetch the file. Download the following file from NVIDIA and upload it to your AWS instance.
- Download cuDNN 5.1 for CUDA 8.0 on Linux
- scp -i ssh-key.pem path/to/downloaded/cudnn ubuntu@ec2 .us-west-1.compute.amazonaws.com:~/
- sudo tar -xzvf cudnn-8.0-linux-x64-v5.1.tgz
- sudo cp cuda/include/cudnn.h /usr/local/cuda/include
- sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
- sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
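Since the question is which cuDNN version actually ends up installed, it may help to read the version out of the copied header before running ./configure. The following is only a sketch: cudnn_header_version is a hypothetical helper name, and it assumes cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL on separate "#define" lines, as the cuDNN 5.x headers I have seen do.

```shell
# Print the cuDNN version recorded in a cudnn.h header as MAJOR.MINOR.PATCHLEVEL.
# The default path is where cudnn.h was copied in the steps above.
# cudnn_header_version is a hypothetical helper, not part of CUDA or cuDNN.
cudnn_header_version() {
    header="${1:-/usr/local/cuda/include/cudnn.h}"
    if [ ! -r "$header" ]; then
        echo "cannot read $header" >&2
        return 1
    fi
    # Assumption: each macro appears on its own line, e.g. "#define CUDNN_MAJOR 5".
    major=$(awk '$1 == "#define" && $2 == "CUDNN_MAJOR" {print $3; exit}' "$header")
    minor=$(awk '$1 == "#define" && $2 == "CUDNN_MINOR" {print $3; exit}' "$header")
    patch=$(awk '$1 == "#define" && $2 == "CUDNN_PATCHLEVEL" {print $3; exit}' "$header")
    echo "${major}.${minor}.${patch}"
}
```

Running cudnn_header_version with no argument checks the default header, and ls -l /usr/local/cuda/lib64/libcudnn* shows which versioned library files were copied alongside it.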
Configure the environment
- Append the following to the end of ~/.bashrc:
  export CUDA_HOME=/usr/local/cuda
  export CUDA_ROOT=/usr/local/cuda
  export PATH=$PATH:$CUDA_ROOT/bin
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_ROOT/lib64:$CUDA_ROOT/extras/CUPTI/lib64
- source ~/.bashrc
- sudo reboot
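If these steps get re-run, the exports can pile up in ~/.bashrc. One way to avoid that is to append each line only when it is missing. This is a rough sketch, not part of the original steps; append_once and add_cuda_env are hypothetical helper names.

```shell
# Append a line to a file only if that exact line is not already present.
# append_once and add_cuda_env are hypothetical helper names.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Add the CUDA exports from the step above to an rc file (default: ~/.bashrc).
# Single quotes keep $PATH and $CUDA_ROOT unexpanded until the rc file is sourced.
add_cuda_env() {
    rc="${1:-$HOME/.bashrc}"
    append_once 'export CUDA_HOME=/usr/local/cuda' "$rc"
    append_once 'export CUDA_ROOT=/usr/local/cuda' "$rc"
    append_once 'export PATH=$PATH:$CUDA_ROOT/bin' "$rc"
    append_once 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_ROOT/lib64:$CUDA_ROOT/extras/CUPTI/lib64' "$rc"
}
```

Calling add_cuda_env followed by source ~/.bashrc should leave the file with exactly one copy of each export, however many times it is run.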
Install Bazel
- sudo add-apt-repository -y ppa:webupd8team/java
- sudo apt-get update
- echo debconf shared/accepted-oracle-license-v1-1 select true | sudo debconf-set-selections
- echo debconf shared/accepted-oracle-license-v1-1 seen true | sudo debconf-set-selections
- sudo apt-get install -y oracle-java8-installer
- sudo apt-get install pkg-config zip g++ zlib1g-dev
- Download https://github.com/bazelbuild/bazel/releases/download/0.3.2/bazel-0.3.2-installer-linux-x86_64.sh on the local machine and scp it to the EC2 instance
- chmod +x bazel-0.3.2-installer-linux-x86_64.sh
- ./bazel-0.3.2-installer-linux-x86_64.sh --user
- rm bazel-0.3.2-installer-linux-x86_64.sh
- bazel version
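As far as I understand, the --user flag makes the Bazel installer place the binary under $HOME/bin, so if bazel version reports "command not found", that directory may simply be missing from PATH. A small sketch for adding it without duplicating entries; path_add is a hypothetical helper name.

```shell
# Add a directory to PATH unless it is already listed.
# path_add is a hypothetical helper name.
path_add() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, nothing to do
        *) PATH="$PATH:$1" ;;      # append at the end
    esac
    export PATH
}
```

For example, path_add "$HOME/bin" followed by bazel version; putting the same call in ~/.bashrc would make it persist across logins.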
Build and install Tensorflow
- git clone --recurse-submodules https://github.com/tensorflow/tensorflow
- cd tensorflow
- TF_UNOFFICIAL_SETTING=1 ./configure
- Press Enter to accept the default at every prompt except:
  - CUDA version = 8.0, CUDA compute capability = 3.0 (K520 GPU)
- bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
- bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
- sudo pip install --upgrade /tmp/tensorflow_pkg/tensorflow-0.12.0rc1-cp27-cp27mu-linux_x86_64.whl
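The wheel filename in the last step depends on the Tensorflow version and the Python ABI, so it will differ if the checkout moves past 0.12.0rc1. A minimal sketch that picks the newest wheel from the output directory instead of hardcoding the name; latest_tf_wheel is a hypothetical helper name, and the default directory is the one used above.

```shell
# Print the path of the most recently modified TensorFlow wheel in a directory.
# latest_tf_wheel is a hypothetical helper name.
latest_tf_wheel() {
    dir="${1:-/tmp/tensorflow_pkg}"
    # ls -t lists newest first; errors are silenced when nothing matches.
    wheel=$(ls -t "$dir"/tensorflow-*.whl 2>/dev/null | head -n 1)
    if [ -z "$wheel" ]; then
        echo "no tensorflow wheel found in $dir" >&2
        return 1
    fi
    echo "$wheel"
}
```

The install step then becomes sudo pip install --upgrade "$(latest_tf_wheel)".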