User Tools

Site Tools


skill-tree:pe:2:3:b

PE2.3 Overview: Profiling tools

Profiling is explained for the CPU level, where it can be supported by hardware performance counters and by sampling techniques.

Sampling is used to see, by examining the program counter, what routines and source code lines of a program are responsible for which portions of the total runtime.

Automatically adding trace code to a parallel program by so-called instrumentation to record its execution in a strict chronology is explained and the difference to profiling is emphasized.

Similar techniques are explained for profiling the network level (e.g. based on InfiniBand counters and I/O server states).

Learning objectives

  • Detect performance issues and bottlenecks caused, for example, by inefficient programming, memory accesses, I/O operations, cache-misses, page-faults, and parallelization overheads.
  • Use environment variables like $IMPISTATS to control the built-in performance analysis functionality in MPI.

Subskills

skill-tree/pe/2/3/b.txt · Last modified: 2024/09/11 12:30 by 127.0.0.1