Design Parallel I/O for a Climate-Simulation Data Pipeline
Overview
What this challenge is about.
Analyze the current I/O pattern, in which each MPI rank writes its own file through serial HDF5 (the classic file-per-process anti-pattern). Design a single shared file written with parallel HDF5 over MPI-IO, using collective writes, a chunked layout aligned to the Lustre stripe size, and compression for compressible fields. Implement the design against the existing simulation framework in Fortran 2008 or C. Benchmark it on the cluster's Lustre filesystem at 128, 256, and 512 ranks. Deliver the source code, a benchmark report, and a 10-page writeup that includes the chunking and striping rationale.
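Moving from one file per rank to a single shared file means every rank must know which slice of the global dataset it owns. The sketch below shows one way to compute that slice, assuming a simple 1D block decomposition along the slowest-varying dimension. The names `slab_t` and `rank_slab` are hypothetical and do not come from the simulation framework. In a parallel HDF5 writer, the resulting offset and count would typically be passed as the start and count arguments of `H5Sselect_hyperslab`, with the transfer property list set to collective mode through `H5Pset_dxpl_mpio`.

```c
#include <stddef.h>

/* Hypothetical description of the rows one rank owns in the shared dataset. */
typedef struct {
    size_t offset; /* first global row owned by this rank */
    size_t count;  /* number of rows owned by this rank */
} slab_t;

/* Split global_n rows across nranks ranks as evenly as possible.
 * The first (global_n % nranks) ranks each receive one extra row,
 * so the slabs are contiguous, non-overlapping, and cover every row. */
slab_t rank_slab(size_t global_n, int nranks, int rank)
{
    size_t base = global_n / (size_t)nranks;
    size_t rem  = global_n % (size_t)nranks;
    size_t r    = (size_t)rank;
    slab_t s;

    s.count  = base + (r < rem ? 1 : 0);
    /* Ranks below rem are each shifted by one extra row per preceding rank;
     * ranks at or above rem are shifted by all rem extra rows. */
    s.offset = r * base + (r < rem ? r : rem);
    return s;
}
```

For example, 10 rows over 4 ranks gives counts of 3, 3, 2, and 2 starting at rows 0, 3, 6, and 8. A real climate grid is usually decomposed in two or three dimensions, in which case the same calculation would be applied per dimension.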
The Brief
What you'll do, and what you'll demonstrate.
Replace per-rank serial HDF5 output with parallel HDF5 + MPI-IO collective writes, cutting I/O time to under 15 percent of wall-clock time at 512 ranks.
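The 15 percent target can be checked mechanically once I/O and total run times have been measured. The sketch below is a minimal illustration with hypothetical function names. It assumes both timings are in seconds and were taken the same way, for example with `MPI_Wtime` around the write phase and around the whole run, reduced across ranks with `MPI_MAX` so that the slowest rank sets the figure.

```c
/* Fraction of wall-clock time spent in I/O.
 * Returns -1.0 when the inputs cannot describe a real run. */
double io_fraction(double io_seconds, double wall_seconds)
{
    if (wall_seconds <= 0.0 || io_seconds < 0.0 || io_seconds > wall_seconds) {
        return -1.0;
    }
    return io_seconds / wall_seconds;
}

/* Returns 1 when the measured I/O fraction is strictly below target
 * (for this brief, target would be 0.15), and 0 otherwise. */
int meets_io_target(double io_seconds, double wall_seconds, double target)
{
    double f = io_fraction(io_seconds, wall_seconds);
    return f >= 0.0 && f < target;
}
```

A run that spends 12 s in I/O out of 100 s total passes a 0.15 target, while one that spends 20 s does not.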
Earning criteria — what you'll demonstrate
- Replace per-rank file output with collective parallel I/O
- Align HDF5 chunk size to the Lustre stripe size for maximum bandwidth
- Use MPI-IO collective writes to aggregate per-rank requests and amortize metadata cost
- Benchmark parallel I/O at scale and identify the binding constraint
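The chunk-to-stripe alignment criterion above can be made concrete with a little arithmetic. The sketch below assumes a 2D field whose chunks span full rows, and finds the smallest number of rows per chunk whose byte size is a whole multiple of the stripe size. The function names and the example sizes are hypothetical. The actual stripe size should be read from the target directory (for instance with `lfs getstripe`).

```c
#include <stddef.h>

/* Greatest common divisor by Euclid's algorithm. */
static size_t gcd_size(size_t a, size_t b)
{
    while (b != 0) {
        size_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Smallest number of full rows per chunk such that
 * rows * row_bytes is a whole multiple of stripe_bytes.
 * Returns 0 if an input is zero or the answer exceeds max_rows. */
size_t chunk_rows_for_stripe(size_t row_bytes, size_t stripe_bytes,
                             size_t max_rows)
{
    size_t rows;

    if (row_bytes == 0 || stripe_bytes == 0) {
        return 0;
    }
    /* lcm(row_bytes, stripe_bytes) / row_bytes == stripe_bytes / gcd */
    rows = stripe_bytes / gcd_size(row_bytes, stripe_bytes);
    return rows <= max_rows ? rows : 0;
}
```

With 1024 single-precision values per row (4096 bytes) and a 1 MiB stripe, the result is 256 rows, so one chunk is exactly one stripe. Two caveats apply to this sketch. A matching chunk size alone does not place chunks on stripe boundaries in the file, so the file access property list would usually also need `H5Pset_alignment`. Compressed chunks have variable on-disk size, so alignment holds only for uncompressed fields.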
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Career mappings coming soon.