K4-B Job Scheduling

Background

Parallel computers are operated differently from a normal PC because all users must share the system. Various operating procedures are therefore in place, and users must understand these concepts and procedures to use the available resources of a system to run a parallel application. A workload manager (job scheduler) controls how the available hardware resources are distributed among user requests (jobs).

Aim

  • To enable practitioners to comprehend and describe the basic architecture and concepts of resource allocation for an HPC system
  • To provide knowledge about how workload managers use job queues to control the unattended background execution of programs (jobs)
  • To provide knowledge about typical scheduling principles (e.g. first come, first served; shortest job first) that serve objectives such as minimizing the average elapsed program runtime and maximizing the utilization of the available HPC resources
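
The effect of the two scheduling principles named above can be illustrated with a minimal sketch. It assumes a strongly simplified model that is not part of this skill description: a single resource that runs one job at a time, all jobs submitted at the same moment, and runtimes known in advance. The function name and the runtimes are hypothetical.

```python
def average_turnaround(runtimes):
    """Average time from submission to completion when the jobs
    run back to back in the given order (all submitted at time 0)."""
    elapsed = 0
    total = 0
    for runtime in runtimes:
        elapsed += runtime   # completion time of this job
        total += elapsed
    return total / len(runtimes)


# Hypothetical runtimes in hours, listed in submission order
jobs = [9, 3, 6]

fcfs = average_turnaround(jobs)           # first come, first served
sjf = average_turnaround(sorted(jobs))    # shortest job first

print(f"FCFS: {fcfs} h, SJF: {sjf} h")    # prints "FCFS: 13.0 h, SJF: 10.0 h"
```

In this model, running the shortest jobs first lowers the average turnaround time because fewer jobs wait behind a long one. Real workload managers typically also weigh priorities, fair share, and resource requests, so the sketch shows only the principle.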

Outcomes

  • explain the concepts and procedures for resource allocation and job execution in an HPC environment
  • run interactive jobs and batch jobs
  • comprehend and describe the expected behavior of job scripts
  • change provided job scripts and embed them into shell scripts to run a variety of parallel applications
  • analyze the output generated by a job scheduler and describe the causes of typical errors
  • comprehend accounting principles (billing for the jobs)
  • comprehend scheduling strategies that increase productivity
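
As an illustration of the accounting outcome above, the following sketch computes the cost of a job in core-hours. It assumes a billing model that charges for every allocated core over the full wallclock time, which is common but not universal; sites may bill per node, weight different resource types, or apply other rules. The function name and the job size are hypothetical.

```python
def core_hours(nodes, cores_per_node, wallclock_hours):
    """Core-hours charged for a job, assuming every allocated core
    is billed for the full wallclock time (site policies differ)."""
    return nodes * cores_per_node * wallclock_hours


# Hypothetical job: 4 nodes with 32 cores each, running for 2.5 hours
print(core_hours(4, 32, 2.5))   # prints 320.0
```

Under this assumption a job is charged for the cores it reserves, not for the cores it keeps busy, so requesting more resources than the application can use raises the cost without shortening the runtime.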

Subskills

skill-tree/k/4/b.1593095099.txt.gz · Last modified: 2020/06/25 16:24 by kai_h