Reservoir Sampling for a Privacy-Preserving Telemetry Pipeline
Overview
What this challenge is about.
Implement Vitter's Algorithm R (and the faster Algorithm L for bonus credit) to draw a uniform 90M-event sample per day (0.5 percent) from a stream of 18B events. Add per-key stratification, so that low-frequency events aren't drowned out by hot ones, using weighted reservoir sampling (A-Res, Efraimidis-Spirakis). Verify uniformity with a chi-square goodness-of-fit test on a 1-day replay. Deliver a Rust implementation, a 5-page methodology memo, a uniformity proof, and an integration plan for the ingestion gateway.
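As a rough illustration of Algorithm R, the sketch below keeps the first k items, then lets the n-th item replace a random slot with probability k/n. The names `ReservoirR` and `SplitMix64` are hypothetical, and the tiny generator is an assumption made only to keep the example dependency-free; a production sampler would likely use a vetted RNG crate instead.

```rust
/// Minimal SplitMix64 generator (an assumption for this sketch only).
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform integer in [0, n), using rejection to avoid modulo bias.
    pub fn below(&mut self, n: u64) -> u64 {
        assert!(n > 0, "n must be positive");
        // Largest multiple of n that fits below u64::MAX; values at or
        // above it are rejected so every residue is equally likely.
        let zone = u64::MAX - (u64::MAX % n);
        loop {
            let x = self.next_u64();
            if x < zone {
                return x % n;
            }
        }
    }
}

/// Algorithm R: a fixed-size uniform sample over a stream of unknown length.
pub struct ReservoirR<T> {
    capacity: usize,
    seen: u64,
    items: Vec<T>,
    rng: SplitMix64,
}

impl<T> ReservoirR<T> {
    pub fn new(capacity: usize, seed: u64) -> Self {
        ReservoirR {
            capacity,
            seen: 0,
            items: Vec::with_capacity(capacity),
            rng: SplitMix64(seed),
        }
    }

    /// Offer the next stream item to the reservoir.
    pub fn offer(&mut self, item: T) {
        self.seen += 1;
        if self.items.len() < self.capacity {
            // Fill phase: the first `capacity` items are always kept.
            self.items.push(item);
        } else {
            // Replacement phase: pick j uniformly in [0, seen); the item
            // survives only if j falls inside the reservoir (prob. k/n).
            let j = self.rng.below(self.seen);
            if j < self.capacity as u64 {
                self.items[j as usize] = item;
            }
        }
    }

    pub fn sample(&self) -> &[T] {
        &self.items
    }

    pub fn seen(&self) -> u64 {
        self.seen
    }
}
```

Memory stays at O(k) regardless of stream length, since only the reservoir and a counter are retained. Algorithm R draws one random number per event after the fill phase, which is the cost Algorithm L is designed to reduce.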
The Brief
What you'll do, and what you'll demonstrate.
Build a reservoir-sampling pipeline that draws a uniform 0.5 percent sample (90M events) from an 18B-event/day stream, with per-key stratification and uniformity verified by a chi-square test.
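One way to picture the weighted side of the brief is A-Res: each event gets a key u^(1/w), with u uniform in (0, 1) and w its weight, and the k events with the largest keys are kept. The sketch below uses the equivalent log-space key ln(u)/w, which preserves the ordering while avoiding underflow. `ARes`, `Keyed`, and `SplitMix64` are hypothetical names, and how weights are chosen (for example, the inverse of an event key's observed frequency) is an assumption the brief leaves open.

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;

/// Minimal SplitMix64 generator (an assumption for this sketch only).
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 strictly inside (0, 1), built from 52 random bits.
    pub fn unit(&mut self) -> f64 {
        ((self.next_u64() >> 12) as f64 + 0.5) / (1u64 << 52) as f64
    }
}

/// An item paired with its A-Res key; ordered by key only.
struct Keyed<T> {
    key: f64,
    item: T,
}

impl<T> PartialEq for Keyed<T> {
    fn eq(&self, other: &Self) -> bool {
        self.key.total_cmp(&other.key) == Ordering::Equal
    }
}
impl<T> Eq for Keyed<T> {}
impl<T> PartialOrd for Keyed<T> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl<T> Ord for Keyed<T> {
    fn cmp(&self, other: &Self) -> Ordering {
        self.key.total_cmp(&other.key)
    }
}

/// A-Res weighted reservoir: keeps the `capacity` items with the largest keys.
pub struct ARes<T> {
    capacity: usize,
    // `Reverse` turns the max-heap into a min-heap, so the smallest
    // retained key is always at the top and cheap to evict.
    heap: BinaryHeap<Reverse<Keyed<T>>>,
    rng: SplitMix64,
}

impl<T> ARes<T> {
    pub fn new(capacity: usize, seed: u64) -> Self {
        ARes { capacity, heap: BinaryHeap::with_capacity(capacity), rng: SplitMix64(seed) }
    }

    /// Offer an item with a positive, finite weight; other weights are skipped.
    pub fn offer(&mut self, item: T, weight: f64) {
        if self.capacity == 0 || !(weight > 0.0) || !weight.is_finite() {
            return;
        }
        // ln(u) / w is a monotone transform of u^(1/w).
        let key = self.rng.unit().ln() / weight;
        if self.heap.len() < self.capacity {
            self.heap.push(Reverse(Keyed { key, item }));
            return;
        }
        let beats_min = self.heap.peek().map_or(false, |m| key > m.0.key);
        if beats_min {
            self.heap.pop();
            self.heap.push(Reverse(Keyed { key, item }));
        }
    }

    /// Consume the sampler and return the retained items (unordered).
    pub fn into_items(self) -> Vec<T> {
        self.heap.into_iter().map(|r| r.0.item).collect()
    }
}
```

Giving rare event keys larger weights raises their chance of surviving in the reservoir, which is the effect the stratification requirement is after. With equal weights for every event, the scheme reduces to a uniform sample.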
Earning criteria — what you'll demonstrate
- Implement Vitter's Algorithm R and reason about its uniformity guarantee
- Apply A-Res (Efraimidis-Spirakis) for weighted/stratified reservoir sampling
- Verify sampling uniformity with chi-square goodness-of-fit
- Engineer a streaming sampler that respects backpressure and memory bounds
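The chi-square criterion above can be made concrete with a small helper. This sketch assumes the replay is split into position buckets of roughly equal width (a hypothetical choice of test design), so a uniform sampler should land about the same number of sampled events in each bucket. `bucket_counts` and `chi_square_uniform` are illustrative names. The helper returns only the statistic and degrees of freedom; comparing against a critical value or computing a p-value is left to the caller.

```rust
/// Count how many sampled stream positions fall into each of `buckets`
/// contiguous slices of a stream with `stream_len` events.
/// Positions at or beyond `stream_len` are ignored.
pub fn bucket_counts(positions: &[u64], stream_len: u64, buckets: usize) -> Vec<u64> {
    let mut counts = vec![0u64; buckets];
    if buckets == 0 || stream_len == 0 {
        return counts;
    }
    for &p in positions.iter().filter(|&&p| p < stream_len) {
        // u128 avoids overflow in p * buckets for very long streams.
        let b = (p as u128 * buckets as u128 / stream_len as u128) as usize;
        counts[b] += 1;
    }
    counts
}

/// Pearson chi-square statistic against a uniform expectation.
/// Returns (statistic, degrees of freedom), or None when there are fewer
/// than two buckets or no observations at all.
pub fn chi_square_uniform(observed: &[u64]) -> Option<(f64, usize)> {
    if observed.len() < 2 {
        return None;
    }
    let total: u64 = observed.iter().sum();
    if total == 0 {
        return None;
    }
    // Under the uniform hypothesis every bucket expects the same count.
    let expected = total as f64 / observed.len() as f64;
    let stat: f64 = observed
        .iter()
        .map(|&o| {
            let d = o as f64 - expected;
            d * d / expected
        })
        .sum();
    Some((stat, observed.len() - 1))
}
```

For example, with 10 buckets there are 9 degrees of freedom, and the tabulated 5 percent critical value is about 16.92; a statistic well above that would be evidence against uniformity. A statistic below the critical value fails to reject uniformity rather than proving it, so the memo should state the significance level and bucket design used.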
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.