Difference between revisions of "Installing TensorFlow"

 
(21 intermediate revisions by 2 users not shown)
Line 14: Line 14:
 
https://stackoverflow.com/questions/46499808/pip-throws-typeerror-parse-got-an-unexpected-keyword-argument-transport-enco#_=_
 
https://stackoverflow.com/questions/46499808/pip-throws-typeerror-parse-got-an-unexpected-keyword-argument-transport-enco#_=_
  
=New (by Wei and Minh): Tensorflow 1.9.0 with GPU Installation Log=
+
=Tensorflow 1.9.0 with GPU Installation Log=
'''Important note: install the version of software/packages strictly according to the instructions provided by Tensorflow. A different version of software, for example CUDA toolkit 9.2 instead of 9.0, might lead to failure in tensorflow.'''  
+
'''Important note:'''<br>
 +
Install the version of software/packages strictly according to the instructions provided by Tensorflow. A different version of software, for example CUDA toolkit 9.2 instead of 9.0, might lead to failure in tensorflow. When upgrading tensorflow, do it very carefully. As of July 2018, Tensorflow is [https://github.com/tensorflow/tensorflow/issues/17629 notoriously easy to break] with careless installation. DO NOT attempt to install Tensorflow under your user account. Tensorflow has been installed for all users, and a new local install will interfere with it. 
 +
==Synopsis==
 +
Tensorflow was previously installed. In summer 2018, a new graphics card was installed on DB Server. Wei and Minh then installed and configured '''tensorflow-gpu 1.9.0 for Python3.6''' for all users of DB Server.
 +
 
 +
==Using Tensorflow==
 +
It is important to know that, on DB Server, Tensorflow-gpu 1.9.0 is installed for ''python3.6'', not for the default ''python3'' (Python 3.5) or the default ''python'' (Python 2.7). Because the system default ''python3'' may change, type the following in a terminal to check:
 +
which python3
 +
and
 +
which python3.6 
 +
 
 +
To quickly test whether tensorflow-gpu is working for ''python3.6'', type the following into a terminal:
 +
python3.6 -c "import tensorflow as tf; sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))"
 +
 
 +
This reports which CPU and GPU devices tensorflow is using. If no GPU device is listed, something is wrong with the installation.
 +
 
 
==NVIDIA configuration==
 
==NVIDIA configuration==
(In progress) Before installing tensorflow with GPU, configure the NVIDIA® software by following instruction: https://www.tensorflow.org/install/install_linux#NVIDIARequirements
+
Before installing tensorflow with GPU, configure the NVIDIA® software by following instruction: https://www.tensorflow.org/install/install_linux#NVIDIARequirements
 
===Install CUDA Toolkit 9.0===
 
===Install CUDA Toolkit 9.0===
 
*1. Installed CUDA Toolkit 9.0 Base Installer with the Runfile option. The toolkit is in  
 
*1. Installed CUDA Toolkit 9.0 Base Installer with the Runfile option. The toolkit is in  
Line 39: Line 54:
 
Add
 
Add
 
  export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
 
  export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
  export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64\${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
+
  export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64\
 +
                        ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
 
Save and exit. Close and open the terminal (or source .bashrc).
 
Save and exit. Close and open the terminal (or source .bashrc).
  
Line 45: Line 61:
 
  nvcc -V
 
  nvcc -V
 
[[File:nvcc.png]]
 
[[File:nvcc.png]]
 +
 
===Install cuDNN v7.1.4===
 
===Install cuDNN v7.1.4===
 
*5. Downloaded cuDNN v7.1.4 for CUDA 9.0:  
 
*5. Downloaded cuDNN v7.1.4 for CUDA 9.0:  
Line 70: Line 87:
 
*9. Added the following path to the LD_LIBRARY_PATH environment variable by editing .bashrc as above:
 
*9. Added the following path to the LD_LIBRARY_PATH environment variable by editing .bashrc as above:
 
   export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64
 
   export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64
 +
 
===Install TensorRT 3.0 (optional)===
 
===Install TensorRT 3.0 (optional)===
 
*10.Did not install TensorRT 3.0
 
*10.Did not install TensorRT 3.0
Line 78: Line 96:
 
  $ export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}</s><br>
 
  $ export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}</s><br>
 
3. <s>If installed correctly, type nvcc- V should verify installation. But currently it returns 'the program nvcc is currently not installed'.</s><br>
 
3. <s>If installed correctly, type nvcc- V should verify installation. But currently it returns 'the program nvcc is currently not installed'.</s><br>
 +
4. When adding libcupti-dev library, after adding the path:
 +
  export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64
 +
Upon source .bashrc, it returns the following:
 +
  -bash: export: `:/usr/local/cuda-9.0/lib64:/usr/local/cuda-9.0/extras/CUPTI/lib64': not a valid identifier
 +
 +
So far it does not affect the functionality of tensorflow, but it will probably affect libcupti-dev library.
  
==Tensorflow Installation Resource==
+
==Tensorflow with GPU support Installation in a virtual environment==
To install tensorflow, follow this instruction here: https://www.tensorflow.org/install/install_linux#InstallingVirtualenv and install tensorflow.
+
We followed this instruction here: https://www.tensorflow.org/install/install_linux#InstallingVirtualenv  
===Install Tensorflow using the Virtual Environment===
+
===Installation===
 
Install on DBServer under the user McNair. Password: askEd
 
Install on DBServer under the user McNair. Password: askEd
 
*1.install virtualenv:
 
*1.install virtualenv:
Line 98: Line 122:
 
  pip install -U pip
 
  pip install -U pip
 
*5. Install TensorFlow in the virtual environment: within  
 
*5. Install TensorFlow in the virtual environment: within  
   pip install -U tensorflow
+
   pip install -U tensorflow-gpu
 
*Validate the installation with:
 
*Validate the installation with:
 
  (venv)$ python -c "import tensorflow as tf; print(tf.__version__)"
 
  (venv)$ python -c "import tensorflow as tf; print(tf.__version__)"
Line 104: Line 128:
 
[[FILE:tensorflow.png]]
 
[[FILE:tensorflow.png]]
  
==Testing Tensorflow with GPU==
+
===Testing Tensorflow with GPU in virtual environment===
 
Create a python file with the following:
 
Create a python file with the following:
 +
import tensorflow as tf
 
  sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
 
  sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
 
Run it in the virtual environment.  
 
Run it in the virtual environment.  
 
[[FILE:TensorflowGPU.png]]
 
[[FILE:TensorflowGPU.png]]
 +
 +
==Tensorflow with GPU support Installation as root for All users==
 +
'''Important note: Currently on DB Server, pip/pip3 is bound to Python 3.6 rather than the default python3 (Python 3.5). Hence the following installs a copy of tensorflow-gpu 1.9.0 for Python 3.6 for all users.'''
 +
 +
===Installation===
 +
We followed instructions here: https://www.tensorflow.org/install/install_linux#InstallingNativePip
 +
 +
*0. Deleted previously installed tensorflow with CPU support:
 +
sudo pip3 uninstall tensorflow
 +
*1. Used this command to install tensorflow-gpu:
 +
  sudo pip3 install -U tensorflow-gpu
 +
 +
===Path variable (crucial)===
 +
If you are logging on as a user who is using tensorflow for the first time, you need to set the CUDA Toolkit 9.0 environment variables. Type into a terminal
 +
nano .bashrc
 +
 +
Add the following:
 +
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
 +
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64\
 +
                        ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
 +
 +
Save and exit (CTRL + O and CTRL + X). Type
 +
source .bashrc
 +
 +
===Testing Tensorflow with GPU as (non-root) user===
 +
After connecting to DB Server over ssh, type the following command into a terminal:
 +
python3.6 -c "import tensorflow as tf; print(tf.__version__);sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))"
 +
 +
[[FILE:TensorflowGPUGlobal.png]]

Latest revision as of 14:19, 16 July 2018

Old

Currently installed with Anaconda Python 3.

https://stackoverflow.com/questions/36355073/upgrading-numpy-fails-with-permission-denied-error

https://www.tensorflow.org/install/install_windows

with cpu support only

https://www.tensorflow.org/install/install_linux

need to logoff other users via server manager

https://stackoverflow.com/questions/46499808/pip-throws-typeerror-parse-got-an-unexpected-keyword-argument-transport-enco#_=_

Tensorflow 1.9.0 with GPU Installation Log

Important note:
Install the version of software/packages strictly according to the instructions provided by Tensorflow. A different version of software, for example CUDA toolkit 9.2 instead of 9.0, might lead to failure in tensorflow. When upgrading tensorflow, do it very carefully. As of July 2018, Tensorflow is notoriously easy to break with careless installation. DO NOT attempt to install Tensorflow under your user account. Tensorflow has been installed for all users, and a new local install will interfere with it.

Synopsis

Tensorflow was previously installed. In summer 2018, a new graphics card was installed on DB Server. Wei and Minh then installed and configured tensorflow-gpu 1.9.0 for Python 3.6 for all users of DB Server.

Using Tensorflow

It is important to know that, on DB Server, tensorflow-gpu 1.9.0 is installed for python3.6, not for the default python3 (Python 3.5) or the default python (Python 2.7). Because the system default python3 may change, type the following in a terminal to check:

which python3

and

which python3.6   
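The same check can be done from Python. The following is a minimal sketch using only the standard library; the function name locate_interpreters is our own and not part of the original log.

```python
import shutil


def locate_interpreters(names=("python", "python3", "python3.6")):
    """Map each interpreter name to its absolute path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print one line per interpreter, similar to running `which` for each name.
    for name, path in locate_interpreters().items():
        print("{:10s} -> {}".format(name, path or "not found on PATH"))
```

If python3 and python3.6 resolve to different paths, scripts that need tensorflow-gpu should be started with python3.6 explicitly.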

To quickly test whether tensorflow-gpu is working for python3.6, type the following into a terminal:

python3.6 -c "import tensorflow as tf; sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))"

This reports which CPU and GPU devices tensorflow is using. If no GPU device is listed, something is wrong with the installation.
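For scripts that should fail early when no GPU is visible, the check can also be done programmatically. This is a sketch only: it assumes the device_lib module that ships inside TensorFlow (tensorflow.python.client.device_lib), which is not mentioned in the original log, and the function name list_gpu_devices is our own.

```python
def list_gpu_devices():
    """Return the names of GPU devices TensorFlow can see.

    Returns None when TensorFlow cannot be imported by this interpreter,
    and an empty list when TensorFlow imports but sees no GPU.
    """
    try:
        # Assumption: device_lib is available in the installed TensorFlow.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]


if __name__ == "__main__":
    gpus = list_gpu_devices()
    if gpus is None:
        print("TensorFlow is not importable with this interpreter")
    elif not gpus:
        print("TensorFlow imported, but no GPU device was found")
    else:
        print("GPU devices:", ", ".join(gpus))
```

Run it with python3.6 on DB Server; a result of None under python3 would be consistent with tensorflow-gpu being installed only for python3.6.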

NVIDIA configuration

Before installing tensorflow with GPU support, configure the NVIDIA® software by following the instructions at: https://www.tensorflow.org/install/install_linux#NVIDIARequirements

Install CUDA Toolkit 9.0

  • 1. Installed CUDA Toolkit 9.0 Base Installer with the Runfile option. The toolkit is in
/usr/local/cuda-9.0 

Did NOT install the NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81, because we believe a much newer driver version (396.26) is already installed. Installed the CUDA 9.0 samples in

HOME/MCNAIR/CUDA-SAMPLES.
  • 2. Installed Patch 1, 2 and 3. The command to install was
sudo sh cuda_9.0.176.2_linux.run # (9.0.176.1 for patch 1 and 9.0.176.3 for patch 3)
  • 3. Set up the environment variables:

The PATH variable needs to include /usr/local/cuda-9.0/bin. To add this path to the PATH variable:

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}

In addition, when using the runfile installation method, the LD_LIBRARY_PATH variable needs to contain /usr/local/cuda-9.0/lib64 on a 64-bit system. To change the environment variable on 64-bit operating systems:

export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Note that the above paths change when using a custom install path with the runfile installation method.
To accomplish this:

nano /home/mcnair/.bashrc

Add

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Save and exit. Close and open the terminal (or source .bashrc).
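The ${VAR:+:${VAR}} form in these exports appends ":" plus the old value only when the variable is already set and non-empty, so no empty entry is left in the list (an empty entry is interpreted as the current directory). The sketch below mimics that rule in Python purely as an illustration; prepend_path is a hypothetical helper, not something used in the installation.

```python
def prepend_path(new_dir, current):
    """Mimic NEW${VAR:+:${VAR}} from the shell.

    The separator and the old value are added only when `current` is set
    and non-empty; otherwise the result is just `new_dir`.
    """
    if current:
        return new_dir + ":" + current
    return new_dir


if __name__ == "__main__":
    # Variable unset or empty: no trailing colon is produced.
    print(prepend_path("/usr/local/cuda-9.0/bin", ""))
    # Variable already populated: the new directory goes in front.
    print(prepend_path("/usr/local/cuda-9.0/bin", "/usr/bin:/bin"))
```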

  • 4. To verify CUDA Toolkit 9.0 is installed, type
nvcc -V

Nvcc.png
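Because the CUDA version must match what tensorflow expects, the release reported by nvcc -V can also be checked in a script. The following sketch assumes the usual nvcc banner line of the form "Cuda compilation tools, release 9.0, V9.0.176"; the function names are our own.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract (major, minor) from the 'release X.Y' part of nvcc -V output."""
    match = re.search(r"release\s+(\d+)\.(\d+)", output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def installed_cuda_release():
    """Run nvcc -V and return (major, minor), or None if nvcc is not on PATH."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(["nvcc", "-V"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found on PATH, or its output was not recognised")
    elif release != (9, 0):
        print("Found CUDA release %d.%d, but 9.0 is expected" % release)
    else:
        print("CUDA Toolkit 9.0 found")
```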

Install cuDNN v7.1.4

  • 5. Downloaded cuDNN v7.1.4 for CUDA 9.0:

In order to download cuDNN, ensure you are registered for the NVIDIA Developer Program. Then go to the NVIDIA cuDNN home page -> click Download -> complete the short survey and click Submit -> accept the Terms and Conditions. A list of available cuDNN versions is displayed -> select the cuDNN version you want to install. We chose the tar file.

  • 6. Install cuDNN: your CUDA directory path is referred to as
/usr/local/cuda/

your cuDNN download path is referred to as

<cudnnpath>

Follow these commands: a. Navigate to your <cudnnpath> directory containing the cuDNN Tar file. b. Unzip the cuDNN package.

$ tar -xzvf cudnn-9.0-linux-x64-v7.tgz

c. Copy the following files into the CUDA Toolkit directory.

$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
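To confirm that the copied header is the expected cuDNN version, the version macros in cudnn.h can be read back. This sketch assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integers, as cuDNN 7 headers do; parse_cudnn_version is a hypothetical helper name.

```python
import os
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h text, or None if a macro is missing."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+" + key + r"\s+(\d+)",
                          header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    header = "/usr/local/cuda/include/cudnn.h"  # destination used in the cp step above
    if os.path.exists(header):
        with open(header) as handle:
            print("cuDNN version:", parse_cudnn_version(handle.read()))
    else:
        print(header, "not found")
```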

Install GPU drivers

  • 7. Did not need to install the GPU drivers because we already had the correct version.

Install libcupti-dev library

  • 8.Tried to install the libcupti-dev library with:
sudo apt-get install cuda-command-line-tools-9-0

but apparently it was already installed. (How surprising!)

LD_LIBRARY_PATH environment variable modification

  • 9. Added the following path to the LD_LIBRARY_PATH environment variable by editing .bashrc as above:
 export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64

Install TensorRT 3.0 (optional)

  • 10.Did not install TensorRT 3.0

Problem encountered

1. In /usr/local/ we found 'CUDA-9.2' and 'CUDA-8.0'. These were probably installed in the past.
2. When executing the following command in a terminal, it returns 'PATH: command not found'.

$ export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}

3. If installed correctly, typing nvcc -V should verify the installation. But currently it returns 'the program nvcc is currently not installed'.
4. When adding libcupti-dev library, after adding the path:

 export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64

Upon running source .bashrc, the shell returns the following:

 -bash: export: `:/usr/local/cuda-9.0/lib64:/usr/local/cuda-9.0/extras/CUPTI/lib64': not a valid identifier

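So far it does not affect the functionality of tensorflow, but it will probably affect the libcupti-dev library. A likely cause is an export statement in .bashrc that is split with a trailing backslash and continued on a line that begins with spaces: bash then passes the expansion on the second line to export as a separate argument, which is not a valid variable name. Writing each export on a single line avoids the error.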
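One way to look for the pattern that can trigger a "not a valid identifier" message is to scan .bashrc for export lines that end in a backslash and are followed by an indented line. The sketch below is a rough heuristic under that assumption, not a full shell parser; find_split_exports is a hypothetical helper name.

```python
import os


def find_split_exports(bashrc_text):
    """Return 1-based line numbers of export lines that end in a backslash
    and are followed by a line starting with whitespace.

    After the backslash-newline is removed, bash treats the leading
    whitespace as a word separator, so the continued text becomes a
    second argument to export.
    """
    lines = bashrc_text.splitlines()
    hits = []
    for index in range(len(lines) - 1):
        line, following = lines[index], lines[index + 1]
        if (line.lstrip().startswith("export ")
                and line.endswith("\\")
                and following[:1] in (" ", "\t")):
            hits.append(index + 1)
    return hits


if __name__ == "__main__":
    path = os.path.expanduser("~/.bashrc")
    if os.path.exists(path):
        with open(path) as handle:
            print("Suspicious export lines:", find_split_exports(handle.read()))
```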

Tensorflow with GPU support Installation in a virtual environment

We followed the instructions here: https://www.tensorflow.org/install/install_linux#InstallingVirtualenv

Installation

Install on DBServer under the user McNair. Password: askEd

  • 1.install virtualenv:

Surprise again! Someone already installed it! Did not install virtualenv again.

  • 2. Create a directory for the virtual environment and choose the Python 3 interpreter
 mkdir ~/tensorflow  # somewhere to work out of
 cd ~/tensorflow
 # Choose one of the following Python environments for the ./venv directory:
 virtualenv --system-site-packages -p python3 venv # Use Python 3.n

NOTE: python2 DOES NOT WORK WITH GPU

  • 3. Activate the Virtualenv environment:
 source ~/tensorflow/venv/bin/activate      # bash
  • 4. Upgrade pip:
pip install -U pip
  • 5. Install TensorFlow within the activated virtual environment:
 pip install -U tensorflow-gpu
  • Validate the installation with:
(venv)$ python -c "import tensorflow as tf; print(tf.__version__)"

Tensorflow.png
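Before validating, it can help to confirm that the interpreter really is the one inside the virtual environment. This is a small sketch using only the standard library; it assumes that older virtualenv releases set sys.real_prefix, while venv and newer virtualenv releases make sys.prefix differ from sys.base_prefix.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases change sys.prefix but keep sys.base_prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtualenv())
    print("Interpreter prefix:", sys.prefix)
```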

Testing Tensorflow with GPU in virtual environment

Create a python file with the following:

import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

Run it in the virtual environment.

TensorflowGPU.png

Tensorflow with GPU support Installation as root for All users

Important note: Currently on DB Server, pip/pip3 is bound to Python 3.6 rather than the default python3 (Python 3.5). Hence the following installs a copy of tensorflow-gpu 1.9.0 for Python 3.6 for all users.

Installation

We followed instructions here: https://www.tensorflow.org/install/install_linux#InstallingNativePip

  • 0. Deleted previously installed tensorflow with CPU support:
sudo pip3 uninstall tensorflow
  • 1. Used this command to install tensorflow-gpu:
 sudo pip3 install -U tensorflow-gpu
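Since the install lands in whichever Python pip3 is bound to, it is worth checking that binding before running the commands above. pip3 --version ends its output with the interpreter version in parentheses, for example "(python 3.6)". The sketch below parses that suffix; the function names are our own and the path in the sample string is illustrative only.

```python
import re
import shutil
import subprocess


def pip_python_version(version_output):
    """Extract (major, minor) from the '(python X.Y)' suffix of pip --version output."""
    match = re.search(r"\(python (\d+)\.(\d+)\)", version_output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def pip3_binding():
    """Return the Python version pip3 installs into, or None if pip3 is not on PATH."""
    if shutil.which("pip3") is None:
        return None
    result = subprocess.run(["pip3", "--version"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return pip_python_version(result.stdout)


if __name__ == "__main__":
    # Illustrative sample line; the install path shown here is hypothetical.
    sample = "pip 10.0.1 from /usr/local/lib/python3.6/dist-packages/pip (python 3.6)"
    print("sample parses to:", pip_python_version(sample))
    print("pip3 on this machine is bound to:", pip3_binding())
```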

Path variable (crucial)

If you are logging on as a user who is using tensorflow for the first time, you need to set the CUDA Toolkit 9.0 environment variables. Type into a terminal

nano .bashrc

Add the following:

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Save and exit (CTRL + O and CTRL + X). Type

source .bashrc 

Testing Tensorflow with GPU as (non-root) user

After connecting to DB Server over ssh, type the following command into a terminal:

python3.6 -c "import tensorflow as tf; print(tf.__version__);sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))"

TensorflowGPUGlobal.png