HPC Grid Tutorial: How to Run a TensorFlow Job

Follow these steps to run a TensorFlow job. Note: Make sure you have access to nodes with GPUs.

1. Log on to the Grid.

2. Copy the required files using the following commands:

cp -R /wsu/el7/scripts/tutorial/addition.py .

cp -R /wsu/el7/scripts/tutorial/tensorflow_job .
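
As a quick sanity check after copying, a small shell function along these lines can confirm that both files landed in the current directory. This is only a sketch: the name check_copied is made up for this illustration and is not a Grid command.

```shell
# Hypothetical helper (not part of the Grid's tooling): print the name of
# every expected file that is missing from the current directory.
# Returns 0 if all files are present, 1 otherwise.
check_copied() {
    missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Usage, run in the directory you copied the files into:
#   check_copied addition.py tensorflow_job
```

If the function prints nothing, both files are in place and you can continue with the next step.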

3. The file that has the job script is tensorflow_job. It contains the following script:


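# Job name; the output file is named after it
#PBS -N Tensorflow
# Request one chunk with one GPU, in the accq queue
#PBS -l select=1:ngpus=1 -q accq
# Send email when the job ends (e) or aborts (a)
#PBS -m ea

# Load the Python 3.7 module, then activate the tensorflow_env conda environment
ml python/3.7
source /wsu/el7/pre-compiled/python/3.7/etc/profile.d/conda.sh
conda activate tensorflow_env

# Run the example script
python addition.py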

Note: Make sure that addition.py is in your home directory, since a PBS job starts there by default and the script calls addition.py without a path.

4. To submit the job, type: qsub tensorflow_job 

5. Once the job is submitted, you can check which node it is running on with the following command: qme

In the qme output, the red boxes show the job ID, the pink box shows the name of the script, and the blue box shows the name of the node on which the job is running. The job's output can be found in the file Tensorflow.oJOB_ID, where JOB_ID is the ID of your job.
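
To make the naming rule concrete, the sketch below builds the output file name from a job name and a job ID. It assumes the ID is printed in the form NUMBER.SERVER (12345.pbsserver is an invented value) and that the file name uses only the numeric part; check this against what qsub and qme actually show on the Grid. The function name output_file_for is hypothetical.

```shell
# Hypothetical helper: derive the output file name for a job.
#   $1 = job name (the value given to "#PBS -N")
#   $2 = job ID, either NUMBER or NUMBER.SERVER (assumed format)
output_file_for() {
    job_name=$1
    job_id=$2
    # Drop everything from the first dot onward, keeping the numeric part.
    seq=${job_id%%.*}
    echo "${job_name}.o${seq}"
}

# Example with an invented job ID:
#   output_file_for Tensorflow 12345.pbsserver   # prints Tensorflow.o12345
#
# Possible use on the Grid, once the job has finished:
#   job_id=$(qsub tensorflow_job)
#   cat "$(output_file_for Tensorflow "$job_id")"
```

The second usage relies on qsub printing the ID of the new job on standard output, so the ID can be captured in a variable at submission time.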