ALGORITHMS FOR PARALLEL COMPUTING
Academic year and teacher
- Academic year: 2022/2023
- Teacher: WALTER BOSCHERI
- Credits: 6
- Didactic period: Second Semester
- SSD: MAT/08
Training objectives
- The main objective of this course is to provide students with the basics of parallel computing, for both distributed and shared memory architectures.
The acquired knowledge will include:
- characteristics of modern architecture for parallel computing;
- metrics for evaluating the performance of parallel codes through computing-time measurements (speedup, scaling and efficiency; see the definitions after this list);
- introduction to MPI paradigm for distributed memory architectures: point-to-point and group communications, reduction and synchronization directives;
- introduction to OpenMP for shared memory architectures;
- examples of the use of widely used libraries for linear algebra, factorization and eigenvalue problems (BLAS, ATLAS, LAPACK, MKL) and their parallel implementations;
- examples of the use of different CPU profilers;
- scientific computing for the solution of partial differential equations.
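As a reference for the metrics listed above, a short summary of the standard definitions (notation is ours: T(p) denotes the wall-clock time on p processors):

    % Speedup and parallel efficiency (standard definitions).
    S(p) = \frac{T(1)}{T(p)}, \qquad E(p) = \frac{S(p)}{p}
    % Ideal (linear) scaling corresponds to S(p) = p, i.e. E(p) = 1.
    % Strong scaling: fixed total problem size as p grows;
    % weak scaling: fixed problem size per processor.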
The main skills are:
- Analyze serial algorithms and identify a correct workload partitioning methodology to achieve a significant speedup;
- Analyze serial algorithms and identify data access patterns that maximize aligned and contiguous memory accesses;
- Analyze code performance when executing on actual architectures.
Prerequisites
- Students need to master the following prerequisites, covered in the "Numerical Analysis" course:
- QR factorization, LU factorization and its use for solving linear systems;
- Floating point representation, machine precision, finite arithmetic.
Familiarity with the FORTRAN programming language is advised.
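As a quick self-check of the floating-point prerequisite (and of basic Fortran syntax), a minimal sketch using only standard Fortran intrinsics; the program name is arbitrary:

    ! Minimal sketch: machine precision and finite arithmetic in Fortran.
    program machine_precision
      implicit none
      real(kind=8) :: eps
      ! epsilon(): separation between 1 and the next representable number.
      eps = epsilon(1.0d0)
      print *, 'double precision machine epsilon:', eps
      ! Finite arithmetic: adding half of epsilon to 1 leaves 1 unchanged.
      print *, '1 + eps/2 == 1 ?', (1.0d0 + eps/2.0d0 == 1.0d0)
    end program machine_precision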
A personal notebook is highly recommended.
Course programme
- Introduction (10 hours)
- introduction to parallel computing, CPU architecture, computational cost;
- introduction to FORTRAN;
- performance measures (speedup, scaling and efficiency), FORTRAN code to measure serial computational time (see the timing sketch after this list);
- Example: implementation of the conjugate gradient method for the solution of linear systems.
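For the serial timing mentioned in the list above, a minimal sketch based on the system_clock intrinsic; the workload being timed is a placeholder:

    ! Minimal sketch: measuring serial wall-clock time with system_clock.
    program timing_example
      implicit none
      integer(kind=8) :: t_start, t_end, rate
      real(kind=8)    :: elapsed, s
      integer         :: i

      call system_clock(count_rate=rate)   ! clock ticks per second
      call system_clock(t_start)

      ! Placeholder workload: a partial harmonic sum.
      s = 0.0d0
      do i = 1, 100000000
         s = s + 1.0d0 / real(i, kind=8)
      end do

      call system_clock(t_end)
      elapsed = real(t_end - t_start, kind=8) / real(rate, kind=8)
      print *, 'sum =', s, '  elapsed time [s]:', elapsed
    end program timing_example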
Using MPI (12 hours)
- introduction to MPI, scripts for compiling and launching parallel codes, blocking and non-blocking point-to-point communications, definition of deadlock, the send-receive deadlock and its solution using the MPI_Sendrecv function (see the sketch after this list), collective communications, reductions, synchronization directives;
- Example: parallel MPI implementation of the conjugate gradient method.
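To illustrate the send-receive deadlock mentioned above, a minimal sketch of a ring exchange: if every rank issued a blocking MPI_Send before its MPI_Recv, all ranks could block; the combined MPI_Sendrecv avoids this. Buffer contents and the tag value are arbitrary:

    ! Minimal sketch: deadlock-free ring exchange with MPI_Sendrecv.
    program ring_exchange
      use mpi
      implicit none
      integer :: ierr, rank, nprocs, left, right
      double precision :: sendbuf, recvbuf

      call MPI_Init(ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

      right = mod(rank + 1, nprocs)          ! neighbour we send to
      left  = mod(rank - 1 + nprocs, nprocs) ! neighbour we receive from

      sendbuf = dble(rank)
      ! MPI_Sendrecv pairs the send and the receive in one call, so no
      ! rank ever blocks waiting for a matching operation.
      call MPI_Sendrecv(sendbuf, 1, MPI_DOUBLE_PRECISION, right, 0, &
                        recvbuf, 1, MPI_DOUBLE_PRECISION, left,  0, &
                        MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)

      print *, 'rank', rank, 'received', recvbuf, 'from rank', left
      call MPI_Finalize(ierr)
    end program ring_exchange

Compile and run, e.g., with mpif90 and mpirun (exact commands depend on the MPI installation).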
Insights
- OpenMP (8 hours)
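As a taste of the shared-memory part, a minimal sketch of a dot product parallelized with an OpenMP reduction clause; the array size is arbitrary, and the code assumes an OpenMP-aware compiler (e.g. gfortran -fopenmp):

    ! Minimal sketch: OpenMP parallel loop with a reduction clause.
    program omp_dot
      use omp_lib
      implicit none
      integer, parameter :: n = 10000000
      real(kind=8), allocatable :: x(:), y(:)
      real(kind=8) :: dot
      integer :: i

      allocate(x(n), y(n))
      x = 1.0d0
      y = 2.0d0

      dot = 0.0d0
      ! Each thread accumulates a private partial sum; the reduction
      ! clause combines the partial sums when the loop ends.
      !$omp parallel do reduction(+:dot)
      do i = 1, n
         dot = dot + x(i) * y(i)
      end do
      !$omp end parallel do

      print *, 'threads:', omp_get_max_threads(), '  dot =', dot
    end program omp_dot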
Examples of parallelization for scientific computing applied to hyperbolic partial differential equations (18 hours):
- heat conduction equation on Cartesian meshes (see the sketch after this list)
- free surface Navier-Stokes equations on Cartesian meshes
- parallelization of general unstructured meshes using the Metis library
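For the first application in the list above, a minimal sketch of an explicit finite-difference scheme for the 1D heat conduction equation u_t = alpha * u_xx on a Cartesian mesh, with the spatial loop parallelized via OpenMP; grid size, diffusivity and initial data are illustrative, and the time step respects the stability bound dt <= dx^2/(2*alpha):

    ! Minimal sketch: explicit update for the 1D heat conduction equation,
    ! OpenMP-parallel inner loop, fixed (Dirichlet) boundary values.
    program heat_1d
      implicit none
      integer, parameter :: n = 1000, nsteps = 1000
      real(kind=8), parameter :: alpha = 1.0d0, dx = 1.0d0 / (n - 1)
      real(kind=8), parameter :: dt = 0.4d0 * dx * dx / alpha  ! stable choice
      real(kind=8) :: u(n), unew(n)
      integer :: i, step

      u = 0.0d0
      u(n/2) = 1.0d0          ! initial heat spike in the middle
      unew = u

      do step = 1, nsteps
         !$omp parallel do
         do i = 2, n - 1
            unew(i) = u(i) + alpha * dt / (dx * dx) &
                      * (u(i+1) - 2.0d0*u(i) + u(i-1))
         end do
         !$omp end parallel do
         u = unew             ! boundaries u(1) and u(n) stay fixed
      end do

      print *, 'centre value after', nsteps, 'steps:', u(n/2)
    end program heat_1d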
Didactic methods
- Main topics will be introduced through slide presentations as well as practical examples.
- Possible applications of parallel codes on the CINECA supercomputer.
Learning assessment procedures
- The level of achievement of the learning objectives will be assessed through an oral test, composed of two parts:
- a talk on a project developed directly by the student;
- a discussion on some topics of the course.
The instructions for the projects will be forwarded to the students during the teaching period.
Reference texts
- Lecture notes.
- Textbook: V. Kumar, A. Grama, A. Gupta, G. Karypis, Introduction to Parallel Computing: Design and Analysis of Algorithms, Addison-Wesley, 2003.
- Suggested for further study: D.P. Bertsekas, J.N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, Prentice-Hall, 1989.