GPU Computing (AP 278)

Odyssey and GPU Computing on Odyssey

Before proceeding further, make yourself familiar with the basics of Odyssey and GPU computing on Odyssey:




Compiling and running CUDA code

1) Login to a node with a GPU. Use the holyseasgpu partition (for AP 278).

srun --pty --x11=first -p holyseasgpu --mem 4000 -t 0-10:00 --gres=gpu:1 bash

2) To find out which CUDA versions are available, type in a terminal:

module-query cuda

3) Load one of the available modules (try cuda/9.2.88 or cuda/8.0):

module load cuda/9.2.88

4) Write or obtain a CUDA code. Here are example CUDA codes from the following excellent reference (go through the reference to understand the differences between the three versions below):





5) Compile the code:

To compile add.cu, for example, type in a terminal:

nvcc add.cu -o add_cuda

6) Run the code in interactive mode (test runs only):

./add_cuda

7) Running batch jobs

Create a script (say runscript.sh) to run the executable by copying and pasting the following lines:

#!/bin/bash
#SBATCH -p holyseasgpu #Partition to submit to 
#SBATCH -n 1 #Number of cores 
#SBATCH --gres=gpu
#SBATCH -t 5 #Runtime in minutes 
#SBATCH --mem-per-cpu=100 #Memory per cpu in MB (see also --mem)
module load cuda/9.2.88-fasrc01
time ./add_cuda

8) Run the script in batch mode with:

sbatch runscript.sh

Links on CUDA (with tutorials and sample CUDA programs)

1) On Odyssey, after you load a cuda module you can access sample programs from:


2) https://devblogs.nvidia.com/even-easier-introduction-cuda/

3) https://devblogs.nvidia.com/easy-introduction-cuda-fortran/

4) https://www.pgroup.com/resources/cudafortran.htm


On Odyssey the PGI OpenACC compiler suite is installed in /n/seasfs03/IACS/ap278/pgi/. To make the compilers (pgcc, pgc++, pgf90, etc.) available in your path, add the following lines to your .bashrc file (this assumes you are using bash, which is the default shell):

export PGI=/n/seasfs03/IACS/ap278/pgi/
export PATH=/n/seasfs03/IACS/ap278/pgi/linux86-64/18.4/bin:$PATH
export MANPATH=$MANPATH:/n/seasfs03/IACS/ap278/pgi/linux86-64/18.4/man
export LM_LICENSE_FILE=/n/seasfs03/IACS/ap278/pgi/license.dat

Once you have added these lines to your ~/.bashrc, make them take effect by running, in a terminal:

source ~/.bashrc

or

. ~/.bashrc

or log out and log back in.

Some useful commands

GPU (accelerator) information: nvidia-smi (short), pgaccelinfo (long)

performance profiler: pgprof (For more info: https://www.pgroup.com/resources/docs/18.5/pdf/pgi18profug.pdf)

Compiling and running code with OpenACC directives

First compile your code (say code_acc.c or code_acc.f90) containing OpenACC directives (see below for example programs).

Note that pgcc and pgf90 must be available in your path for this to succeed (see above for instructions).

For a C program:

pgcc -acc code_acc.c -Minfo=accel

For a Fortran program:

pgf90 -acc code_acc.f90 -Minfo=accel

Either command creates an executable named a.out. The option -Minfo=accel displays useful information on how the compiler parallelized the code.

Slurm script for running the job on Odyssey:

#!/bin/bash
#SBATCH -N 1  #Number of nodes 
#SBATCH -p holyseasgpu  #Partition to submit to 
#SBATCH --ntasks-per-node 2
#SBATCH --gres=gpu:1
#SBATCH -t 5  #Runtime in minutes 
./a.out

An OpenACC example

1) Get the sample code (see Ref. 3 under "Links on OpenACC" below, and watch the excellent short video tutorial there before working through this example):

git clone https://github.com/parallel-forall/cudacasts
cd cudacasts/ep3-first-openacc-program

or, on Odyssey, copy it from the course directory:

cp -r /n/seasfs03/IACS/ap278/cudacasts/ep3-first-openacc-program/ .
cd ep3-first-openacc-program

2) Compile the "serial" (non-OpenACC) code:

pgcc laplace2d.c -o a.out_serial

3) Run the "serial" version and time it:

time ./a.out_serial

4) Compile the code with acc-directives:

pgcc -acc laplace_acc.c -o a.out_acc -Minfo=accel

5) Run the acc-executable:

time ./a.out_acc

Links on OpenACC (with tutorials and sample OpenACC programs)

1) OpenACC example programs

    On Odyssey, you can find the OpenACC example programs in:


2) The following links are very good general references:




3) Excellent reference:


   (Contains excellent video tutorials. Recommended: The video "Your First OpenACC Program" (7.5 minutes).)

   For sample (laplace) code:


4) Introductory OpenACC tutorial (free, but requires an account):


5) https://www.openacc.org/get-started

6) https://www.pgroup.com/resources/docs/18.4/x86/openacc-gs/index.htm

7) https://docs.computecanada.ca/wiki/OpenACC_Tutorial


