
Implement Bulk Synchronous Parallel PageRank on a 1.5B-Edge Graph

Free · Verified credential · 4 weeks · Advanced

Overview

What this challenge is about.

Choose either Apache Spark with GraphX (Pregel API) or a plain MPI + C++ implementation. Run 25 iterations of PageRank on the 1.5B-edge graph (the graph file is provided in CSR format, partitioned by source node). Measure per-iteration wall-clock time, communication volume per superstep, and weak-scaling efficiency by repeating the run on 4-, 8-, and 16-node clusters with proportionally sized graph subsets. Deliver the source code, a cluster reproduction setup (Terraform plus Spark or MPI configuration), a benchmark notebook, and a 6-page report covering scaling behavior, the point at which communication becomes the bottleneck, and a recommendation between Spark and MPI for the lab's workload.
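To make the superstep structure concrete before committing to Spark or MPI, here is a minimal single-process sketch of BSP PageRank over a CSR graph split into contiguous source-node ranges. It is an illustration under stated assumptions, not the challenge's reference solution: the function name `bsp_pagerank`, the damping factor of 0.85, the even range partitioning, and the handling of dangling nodes (their rank is spread uniformly) are all choices made for this example.

```python
from bisect import bisect_right
from collections import defaultdict


def bsp_pagerank(indptr, indices, num_parts, iterations=25, damping=0.85):
    """Simulate BSP PageRank; workers own contiguous source-node ranges (assumed layout)."""
    n = len(indptr) - 1
    # bounds[p]..bounds[p+1] is the vertex range owned by simulated worker p.
    bounds = [(p * n) // num_parts for p in range(num_parts + 1)]

    def owner(v):
        return bisect_right(bounds, v) - 1

    ranks = [1.0 / n] * n
    for _ in range(iterations):
        # Compute phase: each worker scans only its own rows and fills an
        # outbox per destination worker, combining contributions per vertex.
        outbox = [[defaultdict(float) for _ in range(num_parts)]
                  for _ in range(num_parts)]
        dangling = 0.0  # would be a global sum (an all-reduce) on a real cluster
        for p in range(num_parts):
            for u in range(bounds[p], bounds[p + 1]):
                start, end = indptr[u], indptr[u + 1]
                if start == end:
                    dangling += ranks[u]
                    continue
                share = ranks[u] / (end - start)
                for v in indices[start:end]:
                    outbox[p][owner(v)][v] += share

        # Communication phase: deliver every outbox to the owning worker.
        incoming = [0.0] * n
        for p in range(num_parts):
            for q in range(num_parts):
                for v, contribution in outbox[p][q].items():
                    incoming[v] += contribution

        # Barrier: new ranks become visible only after all messages arrive.
        base = (1.0 - damping) / n + damping * dangling / n
        ranks = [base + damping * incoming[v] for v in range(n)]
    return ranks


# Tiny 3-vertex cycle 0 -> 1 -> 2 -> 0 in CSR form.
print(bsp_pagerank([0, 1, 2, 3], [1, 2, 0], num_parts=2))
```

The three phases inside the loop (local compute, message exchange, barrier) are what a Pregel `vprog`/`sendMsg`/`mergeMsg` triple or an MPI exchange followed by a synchronization point would implement at scale; the partition count should not change the resulting ranks, only where the messages travel.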

Credential: Blockchain-anchored
Shareable: LinkedIn-ready
Language: English
Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Implement and benchmark BSP PageRank on a 1.5B-edge graph across a 16-node cluster with honest weak-scaling efficiency reporting.
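Weak-scaling efficiency compares runs where the work per node is held constant, so the ideal per-iteration time stays flat as nodes are added. A minimal sketch of the calculation follows, assuming the smallest cluster is the baseline and the median per-iteration time is the summary statistic; the helper name `weak_scaling_efficiency` is hypothetical, and the timings are invented placeholders, not measurements.

```python
from statistics import median


def weak_scaling_efficiency(iter_times_by_nodes):
    """Map node count -> efficiency, using the smallest cluster as the baseline.

    iter_times_by_nodes: {node_count: [per-iteration seconds, ...]}
    Efficiency is T(baseline) / T(N); 1.0 means perfect weak scaling.
    """
    medians = {n: median(ts) for n, ts in iter_times_by_nodes.items()}
    base = medians[min(medians)]
    return {n: base / medians[n] for n in sorted(medians)}


# Placeholder timings in seconds, made up purely to show the call shape.
example = {
    4: [10.0, 10.2, 9.8],
    8: [12.5, 12.4, 12.6],
    16: [20.0, 19.0, 21.0],
}
print(weak_scaling_efficiency(example))
```

Using the median rather than the mean keeps a single slow iteration (for example a JVM warm-up or a straggler node) from dominating the figure; reporting the spread alongside it is a reasonable way to keep the numbers honest.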

Earning criteria — what you'll demonstrate

  • Apply the BSP model to a real iterative graph algorithm
  • Reason about communication volume per superstep
  • Measure weak-scaling efficiency honestly across cluster sizes
  • Trade off framework convenience (Spark) vs raw control (MPI)
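One way to reason about communication volume per superstep before running anything is to count the edges that cross partition boundaries. The sketch below is a rough estimate under several assumptions: contiguous source-node ranges per worker, one message per cut edge without a combiner, one message per (sending worker, destination vertex) pair with a combiner, and 16 bytes per message (an 8-byte vertex id plus an 8-byte rank). The function name `superstep_volume` is hypothetical, and framework overhead such as serialization headers is ignored.

```python
from bisect import bisect_right


def superstep_volume(indptr, indices, num_parts, bytes_per_msg=16):
    """Estimate bytes sent between workers in one PageRank superstep."""
    n = len(indptr) - 1
    bounds = [(p * n) // num_parts for p in range(num_parts + 1)]

    def owner(v):
        return bisect_right(bounds, v) - 1

    cut_edges = 0
    combined = set()  # distinct (sending worker, destination vertex) pairs
    for u in range(n):
        p = owner(u)
        for v in indices[indptr[u]:indptr[u + 1]]:
            if owner(v) != p:
                cut_edges += 1
                combined.add((p, v))
    return {
        "uncombined_bytes": cut_edges * bytes_per_msg,
        "combined_bytes": len(combined) * bytes_per_msg,
    }


# 4 vertices, edges: 0->2, 0->3, 1->2, 2->0, 3->3, split across 2 workers.
print(superstep_volume([0, 2, 3, 4, 5], [2, 3, 2, 0, 3], num_parts=2))
```

The gap between the two figures indicates how much a sender-side combiner could save; on skewed graphs with a few very popular destination vertices that gap tends to be large, which is worth checking against the measured per-superstep traffic.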

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career mappings coming soon.

One more thing

You can put a credential on your CV by Friday.

Implement Bulk Synchronous Parallel PageRank on a 1.5B-Edge Graph | Ewance Challenge