Revised: Mon Jun 10 2019 by email@example.com
This is an introductory graduate course on parallel computing.
Upon completion, you should
- be able to design and analyze parallel algorithms for a variety of
problems and computational models,
- be familiar with the hardware and software organization of
high-performance parallel computing systems, and
- have experience with the implementation of parallel applications
on high-performance computing systems, and be able to measure,
tune, and report on their performance.
Additional information including the course syllabus can be found in the
- This course will use Piazza to manage class questions and discussions online.
(some material local-access only)
Written and Programming Assignments
- Written Assignments
- Programming Assignments
Platforms and Programming Models
- phaedra is a 20-core Intel Xeon E5-2650v4 compute server dedicated to this class,
with an attached Nvidia Titan V100 accelerator. COMP 633 students have logins on phaedra.
The OpenMP, Cilk, and CUDA programming models are supported.
- longleaf is a research computing cluster with ~350 Intel Xeon E5-2643 nodes providing 24 cores per node.
The OpenMP and Cilk programming models are supported on individual nodes. Compute jobs are submitted using Slurm.
- dogwood is a research computing cluster with ~240 Intel Xeon E5-2699A nodes providing 44 cores per node.
The MPI programming model is supported to coordinate and communicate among nodes. A subset of the nodes has Intel Xeon Phi (KNL) accelerators.
The individual nodes support the MPI, OpenMP, Cilk, OpenACC, and Intel offload (Intel Xeon Phi) programming models.
Compute jobs are submitted using Slurm.
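As a small illustration of the shared-memory model supported on the individual nodes above, the following is a minimal sketch of an OpenMP parallel loop with a reduction. It is not taken from the course materials. The function name estimate_pi and the choice of problem (midpoint-rule integration of 4/(1+x^2) on [0,1]) are assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical example: estimate pi by midpoint-rule integration of
 * 4/(1+x^2) over [0,1]. The loop iterations are independent, so they
 * can be divided among threads; reduction(+:sum) gives each thread a
 * private partial sum and combines the partial sums at the end. */
double estimate_pi(long steps)
{
    assert(steps > 0);                      /* precondition: at least one interval */
    const double width = 1.0 / (double)steps;
    double sum = 0.0;

    /* Without OpenMP enabled at compile time this pragma is ignored
     * and the loop simply runs serially with the same result. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < steps; i++) {
        double x = ((double)i + 0.5) * width;   /* midpoint of interval i */
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * width;
}
```

Called from a main function, estimate_pi(1000000) should agree with pi to several decimal places. OpenMP is enabled with -qopenmp for the Intel compiler (icc) or -fopenmp for gcc, and the number of threads can be set with the OMP_NUM_THREADS environment variable.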
- Intel compilers for C/C++ (icc/icpc) support OpenMP 4.5 with tasking
and accelerator offload, and Cilk extensions to C including Cilk array notation.
- On phaedra, source /opt/intel/bin/compilervars.sh (bash) or
source /opt/intel/bin/compilervars.csh (csh)
to access the Intel compilers and tools.
- On research computing clusters use "module add icc" to access Intel compilers.
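To give a flavor of the OpenMP tasking support mentioned above, here is a minimal sketch of recursive task parallelism. It is an illustration only, not course-provided code. The function names (fib_serial, fib_task, fib_parallel) and the cutoff value of 20 are hypothetical choices; a real cutoff would be tuned by measurement.

```c
#include <assert.h>

/* Plain serial recursion, used below the cutoff to avoid creating
 * very many tiny tasks. */
static long fib_serial(int n)
{
    return (n < 2) ? n : fib_serial(n - 1) + fib_serial(n - 2);
}

/* Recursive Fibonacci with one OpenMP task per subproblem.
 * The cutoff of 20 is an arbitrary illustrative value. */
static long fib_task(int n)
{
    long a, b;
    if (n < 20)
        return fib_serial(n);

    #pragma omp task shared(a)
    a = fib_task(n - 1);
    #pragma omp task shared(b)
    b = fib_task(n - 2);
    #pragma omp taskwait            /* wait for both child tasks to finish */
    return a + b;
}

/* Entry point: create a thread team, let a single thread start the
 * recursion, and let the other threads execute the generated tasks. */
long fib_parallel(int n)
{
    long result = 0;
    assert(n >= 0);                 /* precondition: non-negative index */

    #pragma omp parallel
    #pragma omp single
    result = fib_task(n);

    return result;
}
```

If the file is compiled without OpenMP enabled, the pragmas are ignored and fib_parallel computes the same value serially, which is convenient for checking correctness before measuring parallel performance.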
- Shared memory parallel programming. Specification of the
OpenMP 4.5 API for C/C++ (supported by the Intel compilers).
For a more accessible introduction see the tutorial for OpenMP 3.1 in the Bibliography below.
- Nvidia GPUs: programmed using CUDA C (Compute Capability 7.0 for the V100 on phaedra).
- Intel Xeon Phi: C/C++ (or Fortran) with Intel offload directives or OpenMP acc on dogwood.
- MPI reference material
- MPI programs can be submitted to dogwood
This list will evolve throughout the semester. Specific reading
assignments are listed above.
This page is maintained by
Send mail if you find problems.