GPU Computing (AP 278)
Odyssey and GPU Computing on Odyssey
Before proceeding, familiarize yourself with the basics of Odyssey and of GPU computing on Odyssey:
https://rc.fas.harvard.edu/resources/odyssey-quickstart-guide/
https://rc.fas.harvard.edu/resources/documentation/gpgpu-computing-on-odyssey/
CUDA
Compiling and running CUDA code
1) Log in to a node with a GPU. Use the holyseasgpu partition (for AP 278).
srun --pty --x11=first -p holyseasgpu --mem 4000 -t 0-10:00 --gres=gpu:1 bash
2) To find out which CUDA versions are available, type in a terminal:
module-query cuda
3) Load one of the available modules (try cuda/9.2.88 or cuda/8.0):
module load cuda/9.2.88
4) Write or obtain a CUDA program. The following reference provides example CUDA programs; work through it to understand the differences between the three versions of the example it presents:
https://devblogs.nvidia.com/even-easier-introduction-cuda/
5) Compile the code:
For example, to compile add.cu, type in a terminal:
nvcc add.cu -o add_cuda
6) Run the code in interactive mode (test runs only):
./add_cuda
7) Running batch jobs
Create a script (say runscript.sh) to run the executable by copying and pasting the following lines:
#!/bin/bash
#SBATCH -p holyseasgpu      #Partition to submit to
#SBATCH -n 1                #Number of cores
#SBATCH --gres=gpu
#SBATCH -t 5                #Runtime in minutes
#SBATCH --mem-per-cpu=100   #Memory per cpu in MB (see also --mem)
module load cuda/9.2.88-fasrc01
time ./add_cuda
8) Submit the script as a batch job with:
sbatch runscript.sh
Links on CUDA (with tutorials and sample CUDA programs)
1) On Odyssey, after you load a CUDA module, you can find sample programs in:
$CUDA_HOME/samples
2) https://devblogs.nvidia.com/even-easier-introduction-cuda/
3) https://devblogs.nvidia.com/easy-introduction-cuda-fortran/
4) https://www.pgroup.com/resources/cudafortran.htm
OpenACC
On Odyssey, the PGI OpenACC compiler suite is installed in /n/seasfs03/IACS/ap278/pgi/. To make the compilers (pgcc, pgc++, pgf90, etc.) available in your path, add the following lines to your .bashrc file (this assumes you are using bash, which is the default shell):
export PGI=/n/seasfs03/IACS/ap278/pgi/
export PATH=/n/seasfs03/IACS/ap278/pgi/linux86-64/18.4/bin:$PATH
export MANPATH=$MANPATH:/n/seasfs03/IACS/ap278/pgi/linux86-64/18.4/man
export LM_LICENSE_FILE=/n/seasfs03/IACS/ap278/pgi/license.dat
After adding these lines to your ~/.bashrc, apply them to the current session by running, in a terminal:
source ~/.bashrc
or
. ~/.bashrc
or you can log out and log back in.
Some useful commands
processor information: nvidia-smi (short), pgaccelinfo (long)
performance profiler: pgprof (For more info: https://www.pgroup.com/resources/docs/18.5/pdf/pgi18profug.pdf)
Compiling and running code with OpenACC directives
First compile your code (say code_acc.c or code_acc.f90) containing OpenACC directives (see below for example programs).
Note that pgcc and pgf90 should be available in your path for this to succeed (see above for instructions).
For a C program:
pgcc -acc code_acc.c -Minfo=accel
For a Fortran program:
pgf90 -acc code_acc.f90 -Minfo=accel
Slurm script for running the job on Odyssey:
#!/bin/bash
#SBATCH -N 1                  #Number of nodes
#SBATCH -p holyseasgpu        #Partition to submit to
#SBATCH --ntasks-per-node 2
#SBATCH --gres=gpu:1
#SBATCH -t 5                  #Runtime in minutes
./a.out
An OpenACC example
1) Get the sample code (see reference 3 in the links section below, and watch its short video tutorial before working through this example):
git clone https://github.com/parallel-forall/cudacasts
cd cudacasts/ep3-first-openacc-program
or
cp -r /n/seasfs03/IACS/ap278/cudacasts/ep3-first-openacc-program/ .
cd ep3-first-openacc-program
2) Compile "serial" non-acc code:
pgcc laplace2d.c -o a.out_serial
3) Run the "serial" version and time it:
time ./a.out_serial
4) Compile the code with acc-directives:
pgcc -acc laplace_acc.c -o a.out_acc -Minfo=accel
5) Run the acc-executable:
time ./a.out_acc
Links on OpenACC (with tutorials and sample OpenACC programs)
1) OpenACC example programs
On Odyssey, you can find the OpenACC example programs in:
/n/seasfs03/IACS/ap278/pgi/linux86-64/2018/examples/OpenACC/
2) The following links are very good general references:
https://devblogs.nvidia.com/parallelforall/openacc-example-part-1/
https://devblogs.nvidia.com/openacc-example-part-2/
3) Excellent reference:
https://devblogs.nvidia.com/cudacasts-episode-3-your-first-openacc-program/
(Contains excellent video tutorials. Recommended: The video "Your First OpenACC Program" (7.5 minutes).)
For sample (laplace) code:
https://github.com/parallel-forall/cudacasts
4) Introductory OpenACC tutorial (free, but requires an account):
https://nvidia.qwiklab.com/quests/3?locale=en
5) https://www.openacc.org/get-started
6) https://www.pgroup.com/resources/docs/18.4/x86/openacc-gs/index.htm
7) https://docs.computecanada.ca/wiki/OpenACC_Tutorial
8) http://web.stanford.edu/class/cme213/files/lectures/Lecture_14_openacc2017.pdf
Copyright © 2024 The President and Fellows of Harvard College