Improving MPI Collective I/O Performance With Intra-node Request Aggregation

Kang, Qiao; Lee, Sunwoo; Hou, Kai-yuan; Ross, Robert; Agrawal, Ankit; Choudhary, Alok; Liao, Wei-keng

doi:10.1109/TPDS.2020.3000458


arXiv:1907.12656 (cs)

[Submitted on 29 Jul 2019]

Abstract: Two-phase I/O is a well-known strategy for implementing collective MPI-IO functions. It redistributes I/O requests among the calling processes into a form that minimizes file access costs. As modern parallel computers continue to grow into the exascale era, the communication cost of this request redistribution can quickly overwhelm collective I/O performance. This effect has been observed in parallel jobs that run on multiple compute nodes with a high count of MPI processes on each node. To reduce the communication cost, we present a new design for collective I/O that adds an extra communication layer to perform request aggregation among processes within the same compute node. This approach can significantly reduce inter-node communication congestion when redistributing the I/O requests. We evaluate its performance and compare it with the original two-phase I/O on a Cray XC40 parallel computer with Intel KNL processors. Using I/O patterns from two large-scale production applications and an I/O benchmark, we show performance improvements of up to 29 times when running 16384 MPI processes on 256 compute nodes.
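
The effect of intra-node request aggregation on message counts can be illustrated with a toy model. The sketch below is not the authors' implementation and uses no MPI. It assumes that the file is split into equal contiguous file domains with one global aggregator per domain, that each node has a single local aggregator, and that each rank writes an interleaved (strided) pattern. All function names and parameter values are hypothetical, and the model counts messages only, not time.

```python
def coalesce(requests):
    """Sort (offset, length) requests and merge those that touch or overlap."""
    merged = []
    for off, length in sorted(requests):
        if merged and off <= merged[-1][0] + merged[-1][1]:
            last_off, last_len = merged[-1]
            merged[-1] = (last_off, max(last_len, off + length - last_off))
        else:
            merged.append((off, length))
    return merged


def touched_domains(requests, domain_size):
    """Return the set of file domains (one per global aggregator) that the requests touch."""
    domains = set()
    for off, length in requests:
        first = off // domain_size
        last = (off + length - 1) // domain_size
        domains.update(range(first, last + 1))
    return domains


def count_messages(per_rank_requests, procs_per_node, domain_size):
    """Count messages sent to global aggregators, without and with intra-node aggregation."""
    # Baseline model: every rank sends one message to each aggregator whose domain it touches.
    baseline = sum(len(touched_domains(reqs, domain_size)) for reqs in per_rank_requests)

    # Aggregated model: one local aggregator per node gathers and coalesces the
    # node's requests, then sends one message per touched domain.
    aggregated = 0
    for start in range(0, len(per_rank_requests), procs_per_node):
        node_ranks = per_rank_requests[start:start + procs_per_node]
        node_requests = [req for reqs in node_ranks for req in reqs]
        aggregated += len(touched_domains(coalesce(node_requests), domain_size))
    return baseline, aggregated


# Hypothetical small configuration: 16 ranks, 4 per node, 4 global aggregators.
n_procs, procs_per_node, n_blocks, n_aggregators = 16, 4, 8, 4

# Rank r writes one-byte blocks at offsets r, r + 16, r + 32, ... (interleaved pattern).
per_rank = [[(i * n_procs + rank, 1) for i in range(n_blocks)] for rank in range(n_procs)]
domain_size = n_procs * n_blocks // n_aggregators

baseline, aggregated = count_messages(per_rank, procs_per_node, domain_size)
print(baseline, aggregated)
```

In this toy configuration every rank touches every file domain, so the number of messages arriving at the global aggregators scales with the number of ranks in the baseline model and with the number of nodes in the aggregated model. The extra cost is one gather step inside each node, which does not cross the network.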

Comments:	12 pages, 7 figures
Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:1907.12656 [cs.DC]
	(or arXiv:1907.12656v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.1907.12656
Related DOI:	https://doi.org/10.1109/TPDS.2020.3000458

Submission history

From: Qiao Kang
[v1] Mon, 29 Jul 2019 21:15:48 UTC (222 KB)
