HPC Grid Tutorial: How to Schedule Jobs on the Grid

NOTE: This page is not a replacement for the manual. It is supplementary material and should be used only after reading the PBS documentation: the PBS Professional User’s Guide and the PBS Professional Reference Guide. For users who maintain their own environments and shells, the PBS Pro binaries and man pages are installed in /usr/pbs/bin/ and /usr/pbs/man, respectively.

When you log on to the Grid, you are connected to the head node, called Warrior. Compute nodes are accessed through the PBS Scheduler with either a batch job or an interactive (shell) job. An individual compute node cannot be accessed directly unless you have a job running on that node. For information regarding available queues and nodes, refer to the Available HPC Nodes page.

Submitting Specific Types of Jobs

The examples below are sample job files, each containing a script used to run a job. A job file can define the shell, options, resources, and the job to be run. Job files are submitted to the scheduler with the qsub command. These examples can be copied and modified to suit your needs. When creating a job file, be sure that it contains no unwanted special characters or line breaks.
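
One way to catch unwanted characters before submitting is sketched below. The helper name check_job_file and the file name job_file are hypothetical, chosen only for illustration. The check looks for carriage returns (Windows-style line endings) and for bytes outside printable ASCII.

```shell
#!/bin/bash
# Minimal sketch: scan a job file for characters that commonly break
# PBS directives. 'check_job_file' is a hypothetical helper name.
check_job_file() {
    file="$1"
    cr=$(printf '\r')
    # Windows-style line endings show up as carriage returns
    if grep -q "$cr" "$file"; then
        echo "carriage returns found in $file" >&2
        return 1
    fi
    # Bytes outside printable ASCII and whitespace (for example, smart quotes)
    if LC_ALL=C grep -q '[^[:print:][:space:]]' "$file"; then
        echo "non-ASCII or control characters found in $file" >&2
        return 1
    fi
    return 0
}

# Hypothetical usage: submit only if the check passes
# check_job_file job_file && qsub job_file
```

If the check reports carriage returns, converting the file to Unix line endings before running qsub is usually enough.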

Single Threaded Applications

#!/bin/bash

#PBS -l ncpus=1

## The user requests one processor

#PBS -m ea

## Specifies an email to be sent when the job aborts or completes

#PBS -M zz9992@wayne.edu

## Specifies the email recipient

#PBS -o /wsu/home/zz/zz99/zz9992/pbs/job_name/completed/output_file

## Output to be written to the file 'output_file'

#PBS -e /wsu/home/zz/zz99/zz9992/pbs/job_name/completed/error_file

## Errors to be written to the file 'error_file'

 

cd $TMPDIR

## The user changes to $TMPDIR, the temporary directory under /tmp that PBS creates for the job

 

cp /wsu/home/zz/zz99/zz9992/pbs/job_name/script_file $TMPDIR

cp /wsu/home/zz/zz99/zz9992/pbs/job_name/data_file $TMPDIR

## The user copies the script and data files to the $TMPDIR directory

 

$TMPDIR/script_file

## The script 'script_file' is executed

 

mv $TMPDIR/* /wsu/home/zz/zz99/zz9992/pbs/job_name/completed/.

## The user moves all the files in $TMPDIR to the 'completed' directory under their home directory
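
The copy-in, run, and move-out steps above can be sketched as a single function that also preserves the exit status of the script. This is a sketch under stated assumptions, not part of the site's setup: the name run_staged is hypothetical, and the function creates its own scratch subdirectory with mktemp so that it can be tried outside a PBS job without touching other files in the temporary directory.

```shell
#!/bin/bash
# Sketch of the stage-in / run / stage-out pattern.
# 'run_staged' is a hypothetical helper; its argument is a directory that
# holds script_file and data_file, as in the example above.
run_staged() {
    job_dir="$1"
    # Private scratch directory under TMPDIR (PBS sets TMPDIR inside a job)
    work_dir=$(mktemp -d "${TMPDIR:-/tmp}/job.XXXXXX") || return 1
    mkdir -p "$job_dir/completed"
    cp "$job_dir/script_file" "$job_dir/data_file" "$work_dir" || return 1
    # Run in a subshell so the caller's working directory is unchanged
    ( cd "$work_dir" && ./script_file )
    status=$?
    # Move results back even if the script failed, so partial output survives
    mv "$work_dir"/* "$job_dir/completed/"
    rmdir "$work_dir"
    return "$status"
}

# Hypothetical usage inside a job file:
# run_staged /wsu/home/zz/zz99/zz9992/pbs/job_name
```

Returning the script's own exit status lets PBS record a failed job as failed even though the final move succeeded.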

 

Shared Memory Applications

#!/bin/bash

#PBS -l ncpus=2

## The user requests two processors. Since the user is not using the 'nodes' keyword, both processors will be on the same computer and PBS assumes the job will use shared memory.

#PBS -m ea

## Specifies an email to be sent when the job aborts or completes

#PBS -M zz9992@wayne.edu

## Specifies the email recipient

#PBS -o /wsu/home/zz/zz99/zz9992/pbs/job_name/completed/output_file

## Output to be written to the file 'output_file'

#PBS -e /wsu/home/zz/zz99/zz9992/pbs/job_name/completed/error_file

## Errors to be written to the file 'error_file'

 

cd $TMPDIR

## The user changes to $TMPDIR, the temporary directory under /tmp that PBS creates for the job

 

cp /wsu/home/zz/zz99/zz9992/pbs/job_name/script_file $TMPDIR

cp /wsu/home/zz/zz99/zz9992/pbs/job_name/data_file $TMPDIR

## The user copies the script and data files to the $TMPDIR directory

 

$TMPDIR/script_file

## The script 'script_file' is executed

mv $TMPDIR/* /wsu/home/zz/zz99/zz9992/pbs/job_name/completed/.

## The user moves all the files in $TMPDIR to the 'completed' directory under their home directory
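
A shared memory program only benefits from the requested processors if it starts a matching number of threads. The sketch below assumes the application is threaded with OpenMP and that PBS exports the NCPUS variable inside the job; both are assumptions, not something this page confirms. The helper name thread_count is hypothetical.

```shell
#!/bin/bash
# Sketch: match the thread count to the processors requested with ncpus.
# Assumes PBS exports NCPUS inside the job; falls back to 1 otherwise.
thread_count() {
    echo "${NCPUS:-1}"
}

# OpenMP programs read OMP_NUM_THREADS at startup (assumed threading model)
OMP_NUM_THREADS=$(thread_count)
export OMP_NUM_THREADS
echo "Running with $OMP_NUM_THREADS thread(s)"
```

These lines would go before the line that runs script_file in a job file like the one above.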

 

MPICH Applications

#!/bin/bash

#PBS -l ncpus=8

## The user requests eight processors

#PBS -l nodes=2:ppn=4

## The user requests four processors on each of the two nodes. Because the user is using the 'nodes' keyword, the eight processors will be spread across different computers and PBS assumes the job is a message passing interface (MPI) job

#PBS -m ea

## Specifies an email to be sent when the job aborts or completes. 

#PBS -M zz9992@wayne.edu

## Specifies the email recipient 

#PBS -o /wsu/home/zz/zz99/zz9992/pbs/job_name/completed/output_file

## Output to be written to the file 'output_file'

#PBS -e /wsu/home/zz/zz99/zz9992/pbs/job_name/completed/error_file

## Errors to be written to the file 'error_file'

 

mpirun -machinefile $PBS_NODEFILE -np 8 /wsu/home/zz/zz99/zz9992/pbs/job_name/script_file

## The user executes 'script_file' using the mpirun wrapper
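
Rather than hard-coding '-np 8', the rank count can be derived from the node file so that it stays in step with the resource request. The sketch below assumes $PBS_NODEFILE contains one line per process slot; count_ranks is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: derive the MPI rank count from the node file.
# 'count_ranks' is a hypothetical helper name.
count_ranks() {
    # Assumes one line per process slot; tr strips padding some wc versions add
    wc -l < "$1" | tr -d ' '
}

# Inside a job, PBS_NODEFILE points at the node list assigned to the job
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    np=$(count_ranks "$PBS_NODEFILE")
    echo "ranks available: $np"
    # Hypothetical launch line using the derived count:
    # mpirun -machinefile "$PBS_NODEFILE" -np "$np" /wsu/home/zz/zz99/zz9992/pbs/job_name/script_file
fi
```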

 

MPICH2

#!/bin/bash

#PBS -l select=1:ncpus=1:host=hostA+2:ncpus=1:host=hostB

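## One chunk with one processor is requested on hostA and two chunks with one processor each on hostB, so three MPD daemons are started.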

#PBS -m ea

## Specifies an email to be sent when the job aborts or completes. 

#PBS -M zz9992@wayne.edu

## Specifies the email recipient 

#PBS -o /wsu/home/zz/zz99/zz9992/pbs/job_name/completed/output_file

## Output to be written to the file 'output_file'

#PBS -e /wsu/home/zz/zz99/zz9992/pbs/job_name/completed/error_file

## Errors to be written to the file 'error_file'

 

mpirun -np 3 /wsu/home/zz/zz99/zz9992/pbs/mytask 

## The program 'mytask' is run using the mpirun command and three tasks are specified with '-np 3'. This matches the three MPD daemons started by the 'select' statement. The maximum number of ranks that can be launched is the number of entries in $PBS_NODEFILE.
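
To confirm that the node file matches the 'select' statement, the entries can be summarized per host. This is a sketch that assumes the node file lists each host once per process slot; ranks_per_host is a hypothetical helper name, and the sample contents in the comment are what the request above would be expected to produce, not captured output.

```shell
#!/bin/bash
# Sketch: summarize how many node file entries each host has.
# 'ranks_per_host' is a hypothetical helper name.
ranks_per_host() {
    # Group identical host names, then print them as host=count
    sort "$1" | uniq -c | awk '{print $2 "=" $1}'
}

# For a node file containing the lines hostA, hostB, hostB this prints:
#   hostA=1
#   hostB=2
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    ranks_per_host "$PBS_NODEFILE"
fi
```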

Interactive Applications

#PBS -I -q wsuq 

## The user requests an interactive job on the wsuq queue. This command opens an interactive shell that is useful for jobs that have graphical interfaces or jobs that require user interaction. 

#PBS -l ncpus=1 

## The user requests one processor

 

cd $TMPDIR

## The user moves to the directory $TMPDIR

cp /wsu/home/zz/zz99/zz9992/pbs/job_name/script_file $TMPDIR

cp /wsu/home/zz/zz99/zz9992/pbs/job_name/data_file $TMPDIR

## The user copies the script and data files to the $TMPDIR directory

 

./script_file

## 'script_file' is executed

 

mv $TMPDIR/* /wsu/home/zz/zz99/zz9992/pbs/job_name/completed/.

## The user moves all the files in $TMPDIR to the 'completed' directory under their home directory and the job ends
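
The same interactive request can also be written as options on the qsub command line. The sketch below only builds and prints that command so it can be inspected before use; interactive_cmd is a hypothetical helper name, and the defaults are the queue and processor count used in the example above.

```shell
#!/bin/bash
# Sketch: build the qsub command line for an interactive job.
# 'interactive_cmd' is a hypothetical helper; it prints the command only.
interactive_cmd() {
    queue="${1:-wsuq}"   # queue name, defaults to the example's wsuq
    ncpus="${2:-1}"      # number of processors to request
    echo "qsub -I -q $queue -l ncpus=$ncpus"
}

interactive_cmd          # prints: qsub -I -q wsuq -l ncpus=1
```

Running the printed command on the head node would request the interactive shell; the remaining steps are then typed at that shell's prompt.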