skill-tree:k:4:1:b
Differences
This shows you the differences between two versions of the page.
skill-tree:k:4:1:b [2020/07/14 00:40] – luciana | skill-tree:k:4:1:b [2020/07/19 11:30] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 4: | Line 4: | ||
It covers generic and widely used concepts that serve to maximize the efficiency of a supercomputer. | It covers generic and widely used concepts that serve to maximize the efficiency of a supercomputer. | ||
- | Batch jobs submitted to a job queue define the workloads in batch systems. A workload | + | Batch jobs submitted to a job queue define the workloads in batch systems. |
- | manager of a cluster system typically deals with | + | A workload manager of a cluster system typically deals with: |
* Job Control to provide a user interface for submitting jobs to job queues, monitoring their state during processing (e.g. to check their estimated starting time), and intervening in their execution (e.g. to abort them manually) | * Job Control to provide a user interface for submitting jobs to job queues, monitoring their state during processing (e.g. to check their estimated starting time), and intervening in their execution (e.g. to abort them manually) | ||
* Scheduling and Resource Management to select a waiting job for execution and to allocate nodes to the job meeting all its other demands for computing resources (memory, special processing elements like GPUs, etc.) | * Scheduling and Resource Management to select a waiting job for execution and to allocate nodes to the job meeting all its other demands for computing resources (memory, special processing elements like GPUs, etc.) | ||
* Accounting to record historical data about how many computing resources (e.g. computing time) have been consumed by a job | * Accounting to record historical data about how many computing resources (e.g. computing time) have been consumed by a job | ||
- | |||
- | |||
# Aim | # Aim | ||
- | To enable practitioners to comprehend and describe the basic architecture and concepts of resource allocation for an HPC system | + | To enable practitioners to comprehend and describe the basic architecture and concepts of resource allocation for an HPC system. |
# Outcomes | # Outcomes | ||
- | * Comprehend the exclusive and shared usage model in HPC | + | * Comprehend the exclusive and shared usage model in HPC. |
- | * Differentiate batch and interactive job submission | + | * Differentiate batch and interactive job submission. |
- | * Comprehend the generic concepts and architecture of resource manager, scheduler, job and job script | + | * Comprehend the generic concepts and architecture of resource manager, scheduler, job and job script. |
- | * Explain environment variables as a means to communicate | + | * Explain environment variables as a means to communicate. |
- | * Comprehend accounting principles | + | * Comprehend accounting principles. |
- | * Explain the generic steps to run and monitor a single job | + | * Explain the generic steps to run and monitor a single job. |
- | * Comprehend scheduling principles (first come first served, shortest job first, backfilling) to achieve objectives such as minimizing the average elapsed program runtime and maximizing the utilization of the available HPC resources | + | * Comprehend scheduling principles (first come first served, shortest job first, backfilling) to achieve objectives such as minimizing the average elapsed program runtime and maximizing the utilization of the available HPC resources. |
# Subskills | # Subskills | ||
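Two of the scheduling principles named in the outcomes, first come first served and shortest job first, can be compared with a small sketch. This is only an illustration under simplifying assumptions, not how any particular workload manager works: a single resource runs one job at a time, all jobs are already waiting in the queue, and the requested runtimes are exact. The `Job` class, the function names, and the runtimes are hypothetical. Backfilling is left out because it needs a model of multiple nodes and reservations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    runtime: int  # hypothetical requested runtime in minutes


def fcfs(queue):
    """First come first served: keep the submission order."""
    return list(queue)


def sjf(queue):
    """Shortest job first: run the job with the smallest runtime first."""
    return sorted(queue, key=lambda job: job.runtime)


def average_turnaround(order):
    """Average time from submission to completion (waiting plus running).

    Assumes every job was submitted at time 0 and jobs run one after another.
    """
    clock = 0
    total = 0
    for job in order:
        clock += job.runtime  # this job finishes at the current clock value
        total += clock
    return total / len(order)


# Hypothetical queue, listed in submission order.
queue = [Job("a", 60), Job("b", 10), Job("c", 30)]

fcfs_avg = average_turnaround(fcfs(queue))
sjf_avg = average_turnaround(sjf(queue))

print(f"FCFS average turnaround: {fcfs_avg:.1f} min")
print(f"SJF average turnaround:  {sjf_avg:.1f} min")
```

With this queue, shortest job first gives a lower average turnaround than first come first served, because the long job no longer delays the two short ones. The trade-off is that long jobs can be postponed repeatedly if short jobs keep arriving.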
skill-tree/k/4/1/b.1594680018.txt.gz · Last modified: 2020/07/14 00:40 by luciana