HPC Grid Tutorial: How to Run an MPI Job

  1. Log in to the Grid.
  2. From your home directory, copy the MPI tutorial directory into the current directory (the trailing dot) by typing: cp -R /wsu/el7/scripts/tutorial/mpi_tutorial .

  3. Go to the mpi_tutorial directory: cd mpi_tutorial

    Edit the job script by typing: vim mpi_python.sh

    Press i to begin editing, and use the arrow keys to move up and down through the file. Replace the email with your AccessID and the path to the mpi4.py script with your own.

    When finished, press Esc, type :wq, and press Enter to save and quit.
  4. Run the MPI job script by typing: sbatch mpi_python.sh

    Check to see that your job is running by typing: qme

    This gives you the JOBID, the QOS the job is running on, the state of the job, the job name, the number of nodes your job is running on, the number of cores, the memory requested, any features that were specified, the nodelist, and the reason if the job isn't running.

    You can log in to any of the nodes that your job is running on by typing: ssh $HOSTNAME where $HOSTNAME is the name of the node. When you are running an MPI job, you should ssh to the last node in the list and check that your job is running on it.

    You can check how many resources your job is using on the node by typing: top -u $USER where $USER is your AccessID. Press q to quit top when you're done.
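
The check on the last node can also be scripted. The following is only a minimal sketch, not part of the tutorial files: it assumes the standard Slurm commands squeue and scontrol are available on the Grid (qme is site-specific, so it is not used here), and the function names and the job ID 123456 are hypothetical.

```shell
# Sketch only: find the last node of a running job so you can inspect it.
# Assumes standard Slurm tools (squeue, scontrol); function names are made up.

# Print the last line of stdin, i.e. the last host in an expanded node list.
last_node() {
    tail -n 1
}

# Expand a job's compact node list (for example node[01-04]) into one
# hostname per line and print the last one.
last_node_of_job() {
    local jobid="$1"
    local nodelist
    nodelist=$(squeue -h -j "$jobid" -o "%N")   # compact node list of the job
    scontrol show hostnames "$nodelist" | last_node
}

# Example use on the Grid (123456 is a hypothetical job ID):
#   node=$(last_node_of_job 123456)
#   ssh "$node" top -b -n 1 -u "$USER"   # one batch-mode snapshot of your processes
```

Running top with -b -n 1 prints a single snapshot and exits, which avoids having to press q over ssh.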

When your job is finished, you should have an error file and an output file in your directory.
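
A quick way to inspect those two files is sketched below. This is an illustrative helper, not something shipped with the tutorial: the function name is hypothetical, and the actual file names depend on how the job script names its output and error files, so pass in whatever names appear in your directory.

```shell
# Sketch only: report the state of a finished job's output and error files.
# Usage: check_job_files OUTPUT_FILE ERROR_FILE
check_job_files() {
    local out="$1" err="$2"
    # Both files should exist once the job has finished.
    if [ ! -e "$out" ] || [ ! -e "$err" ]; then
        echo "missing"
        return 1
    fi
    # A non-empty error file means the job wrote something to stderr.
    if [ -s "$err" ]; then
        echo "errors"
        return 2
    fi
    echo "clean"
}

# Example use (the file names here are hypothetical):
#   check_job_files output_123456.out errors_123456.err
```

A result of "errors" only means the error file is not empty; some programs write warnings there even when the job succeeds, so read the file before resubmitting.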