K1.3.2.5.1-B MPI-IO

Background

MPI-IO is an API standard for parallel I/O that allows multiple processes of a parallel program to access data in a common file simultaneously. MPI-IO leverages the wide acceptance of the MPI interface to define a similar interface for I/O, mapping file reads and writes to message-passing receives and sends. Accesses to a file can be independent (no coordination between processes takes place) or collective (each process of the group associated with the communicator must participate in the collective access).
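The distinction shows up directly in the API. Below is a minimal sketch, assuming each rank writes one block of a shared file (the file name and sizes are illustrative, not from this page): MPI_File_write_at is the independent call, and MPI_File_write_at_all is its collective counterpart that every rank of the communicator must make.

```c
/* A minimal sketch of independent vs. collective access: each rank owns
 * one contiguous block of the shared file "out.dat" (name and sizes are
 * illustrative assumptions). */
#include <mpi.h>

#define N 1024  /* ints written per process (illustrative) */

int main(int argc, char **argv)
{
    int rank, buf[N];
    MPI_File fh;
    MPI_Offset offset;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (int i = 0; i < N; i++) buf[i] = rank;

    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    offset = (MPI_Offset)rank * sizeof(buf);

    /* Independent access: no coordination between processes. */
    MPI_File_write_at(fh, offset, buf, N, MPI_INT, MPI_STATUS_IGNORE);

    /* Collective access (here rewriting the same bytes, for contrast):
     * every rank of the communicator must make this call, which lets the
     * implementation merge and schedule the requests globally. */
    MPI_File_write_at_all(fh, offset, buf, N, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```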

MPI-IO supports:

  * a high-level interface to describe the partitioning of file data among processes,
  * a collective interface describing complete transfers of global data structures between process memories and files,
  * asynchronous I/O operations, allowing computation to be overlapped with I/O (a sketch follows this list), and
  * optimization of the physical file layout on storage devices (disks).
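As a sketch of the asynchronous interface (the file name, sizes, and the dummy computation are illustrative assumptions), a rank can start a nonblocking write, compute while the I/O proceeds, and complete the request with MPI_Wait before reusing the buffer:

```c
/* A minimal sketch of overlapping computation with I/O. */
#include <mpi.h>
#include <stdio.h>

#define N 1024  /* doubles written per process (illustrative) */

int main(int argc, char **argv)
{
    int rank;
    double buf[N], sum = 0.0;
    MPI_File fh;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (int i = 0; i < N; i++) buf[i] = (double)rank;

    MPI_File_open(MPI_COMM_WORLD, "async.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Start the write without blocking... */
    MPI_File_iwrite_at(fh, (MPI_Offset)rank * sizeof(buf),
                       buf, N, MPI_DOUBLE, &req);

    /* ...do unrelated computation while the I/O proceeds... */
    for (int i = 1; i <= 1000000; i++) sum += 1.0 / i;

    /* ...and complete the request before buf may be reused. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    if (rank == 0) printf("overlapped work: sum = %f\n", sum);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```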

In the MPI-IO model, each process defines an MPI datatype describing its section of the file. These datatypes are passed to the MPI-IO routines, and data is read and transferred directly to and from local memory; there is no single large buffer and no explicit master process. Each read/write access operates on a number of elements, which can be of any MPI basic or derived datatype, and partitioning the file data via derived datatypes adds flexibility and expressiveness.
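A minimal sketch of this model, assuming a 2D global array of ints distributed as block rows (the array shape, file name, and even divisibility of rows by the process count are illustrative assumptions): each rank describes its slice with MPI_Type_create_subarray, installs it as the file view, and one collective call writes the slice into the right place in the shared file.

```c
#include <mpi.h>

#define GROWS 8   /* global rows (illustrative; assumed divisible by nprocs) */
#define GCOLS 8   /* global columns (illustrative) */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_File fh;
    MPI_Datatype filetype;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int lrows = GROWS / nprocs;            /* rows owned by this rank */
    int sizes[2]    = { GROWS, GCOLS };    /* global array shape      */
    int subsizes[2] = { lrows, GCOLS };    /* this rank's block       */
    int starts[2]   = { rank * lrows, 0 }; /* where the block begins  */
    int local[lrows][GCOLS];

    for (int i = 0; i < lrows; i++)
        for (int j = 0; j < GCOLS; j++)
            local[i][j] = rank;

    /* Derived datatype describing this rank's region of the file. */
    MPI_Type_create_subarray(2, sizes, subsizes, starts, MPI_ORDER_C,
                             MPI_INT, &filetype);
    MPI_Type_commit(&filetype);

    MPI_File_open(MPI_COMM_WORLD, "array.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_INT, filetype, "native", MPI_INFO_NULL);

    /* Collective write: no master process, no single large buffer. */
    MPI_File_write_all(fh, &local[0][0], lrows * GCOLS, MPI_INT,
                       MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
    MPI_Finalize();
    return 0;
}
```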

MPI-IO defines non-contiguous operations and collective calls, which leads to a classification of data access into four levels. These levels are characterized by two orthogonal aspects: contiguous vs. non-contiguous data access, and independent vs. collective calls. Depending on the level, a different set of optimizations can be applied.
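The sketch below contrasts the two extremes for a striped file layout (level 0: independent and contiguous; level 3: collective and non-contiguous); the file name and sizes are illustrative assumptions. The level-3 variant hands the implementation the whole access pattern at once, which is what enables optimizations such as collective buffering.

```c
#include <mpi.h>

#define BLOCK 64     /* ints per stripe (illustrative) */
#define NBLOCKS 16   /* stripes per process (illustrative) */

int main(int argc, char **argv)
{
    int rank, nprocs, buf[BLOCK * NBLOCKS];
    MPI_File fh;
    MPI_Datatype filetype;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_File_open(MPI_COMM_WORLD, "striped.dat", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);

    /* Level 0: one independent, contiguous request per stripe.
     * The implementation sees many small, uncoordinated accesses. */
    for (int b = 0; b < NBLOCKS; b++) {
        MPI_Offset off = ((MPI_Offset)b * nprocs + rank)
                         * BLOCK * sizeof(int);
        MPI_File_read_at(fh, off, buf + b * BLOCK, BLOCK, MPI_INT,
                         MPI_STATUS_IGNORE);
    }

    /* Level 3: the same data as one collective, non-contiguous request.
     * A file view built from a derived datatype exposes all stripes,
     * so the implementation can merge the ranks' accesses. */
    MPI_Type_vector(NBLOCKS, BLOCK, nprocs * BLOCK, MPI_INT, &filetype);
    MPI_Type_commit(&filetype);
    MPI_File_set_view(fh, (MPI_Offset)rank * BLOCK * sizeof(int),
                      MPI_INT, filetype, "native", MPI_INFO_NULL);
    MPI_File_read_all(fh, buf, BLOCK * NBLOCKS, MPI_INT,
                      MPI_STATUS_IGNORE);

    MPI_Type_free(&filetype);
    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```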

Aim

  * To provide a widely used standard for describing parallel I/O operations within an MPI message-passing application.
  * To describe independent and collective file I/O operations by processes in a parallel application.
  * To improve the performance of a parallel application.

Outcomes

  * Employ MPI derived datatypes for expressing the data layout in the file as well as the partitioning of the file data among the communicator processes.
  * Specify high-level information about I/O to the system rather than low-level system-dependent information.
  * Collect statistics on the actual read/write operations performed to validate the MPI-IO performance (see the timing sketch below).
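For the last outcome, one simple approach (a sketch under an assumed file name and sizes; the MPI standard itself does not collect I/O statistics for you) is to time a collective write with MPI_Wtime and reduce to the slowest rank, since a collective is only as fast as its slowest participant:

```c
#include <mpi.h>
#include <stdio.h>

#define N (1 << 20)  /* ints per rank (illustrative) */

int main(int argc, char **argv)
{
    int rank, nprocs;
    static int buf[N];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_File_open(MPI_COMM_WORLD, "bench.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    double t0 = MPI_Wtime();
    MPI_File_write_at_all(fh, (MPI_Offset)rank * sizeof(buf),
                          buf, N, MPI_INT, MPI_STATUS_IGNORE);
    double t = MPI_Wtime() - t0;

    /* Report the slowest rank's time and the aggregate bandwidth. */
    double tmax;
    MPI_Reduce(&t, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("%.3f s, %.1f MiB/s aggregate\n", tmax,
               (double)nprocs * sizeof(buf) / tmax / (1 << 20));

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```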

Subskills
