Code

Parallelize an Image-Processing Pipeline with Data Parallelism

Free · Verified credential · 1 week · Beginner

Overview

What this challenge is about.

You receive the current pipeline (Python 3.12, roughly 600 lines, built on Pillow and ffmpeg), a representative batch of 1,000 images averaging 3 MB each, and the host specs (16 cores, 32 GB RAM). Rewrite the pipeline using ProcessPoolExecutor with chunked task submission, and pay the per-process startup cost only once per worker by loading fonts and watermarks in an initializer. Benchmark scaling efficiency from 1 to 16 workers, plot the curve, and identify the point where speedup diverges from linear, separating the Amdahl's-law (serial fraction) contribution from the per-image-size contribution. Validate output equivalence: every image must hash identically to its serial output. Deliver the parallel pipeline, the benchmark report, the scaling curve, the equivalence-validation script, and a 4-page write-up for the media-platform team.

Credential: Blockchain-anchored
Shareable: LinkedIn-ready
Language: English
Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Rewrite a serial image pipeline using data-parallel execution, measure scaling efficiency from 1 to 16 workers, and validate per-image output equivalence.
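One possible shape for the per-image equivalence check is to hash every file in the serial and parallel output directories and report any name whose digests differ or that exists on only one side. This is a minimal sketch; the function names and the assumption that both runs write the same relative file names are illustrative, not part of the brief.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB blocks so large images are not read in one go."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


def hash_tree(root) -> dict[str, str]:
    """Map each file's path relative to root onto its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def find_mismatches(serial_dir, parallel_dir) -> list[str]:
    """Return sorted names that differ or are missing on either side."""
    serial = hash_tree(serial_dir)
    parallel = hash_tree(parallel_dir)
    return sorted(
        name
        for name in serial.keys() | parallel.keys()
        if serial.get(name) != parallel.get(name)
    )
```

An empty list from find_mismatches means every output is byte-identical. Byte-level hashing is strict: any nondeterminism in encoding (for example, embedded timestamps) would show up as a mismatch even when the pixels agree.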

Earning criteria — what you'll demonstrate

  • Apply ProcessPoolExecutor with proper initializer + chunksize
  • Measure scaling efficiency vs. ideal speedup
  • Validate output equivalence under parallel execution
  • Identify Amdahl's-law contributions empirically
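The scaling and Amdahl's-law criteria above reduce to a few formulas. With T(1) the serial wall time and T(n) the time on n workers, speedup is T(1)/T(n) and efficiency is speedup divided by n. Inverting Amdahl's law, S = 1 / (f + (1 - f)/n), gives an empirical serial fraction f = (n/S - 1)/(n - 1), also known as the Karp-Flatt metric. The sketch below assumes timings arrive as a dict of worker count to seconds; the function names are hypothetical.

```python
def speedup(t1: float, tn: float) -> float:
    """Measured speedup on n workers relative to the 1-worker run."""
    return t1 / tn


def efficiency(t1: float, tn: float, n: int) -> float:
    """Fraction of ideal (linear) speedup achieved on n workers."""
    return speedup(t1, tn) / n


def amdahl_speedup(f: float, n: int) -> float:
    """Speedup predicted by Amdahl's law for serial fraction f."""
    return 1.0 / (f + (1.0 - f) / n)


def serial_fraction(t1: float, tn: float, n: int) -> float:
    """Empirical serial fraction from one measurement (requires n > 1)."""
    s = speedup(t1, tn)
    return (n / s - 1.0) / (n - 1.0)


def scaling_table(timings: dict[int, float]) -> list[tuple]:
    """Rows of (workers, speedup, efficiency, serial fraction or None)."""
    t1 = timings[1]
    rows = []
    for n in sorted(timings):
        tn = timings[n]
        frac = serial_fraction(t1, tn, n) if n > 1 else None
        rows.append((n, speedup(t1, tn), efficiency(t1, tn, n), frac))
    return rows
```

If the estimated serial fraction stays roughly constant as n grows, the departure from linear speedup is consistent with a fixed serial portion. A fraction that rises with n points instead to overheads that grow with worker count, such as contention or load imbalance from uneven image sizes.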

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career mappings coming soon.

