Implement Federated Learning for a Government Statistics Office
Overview
What this challenge is about.
Use Flower as the FL framework. Simulate 8 municipalities, each holding one partition of a synthetic wage dataset (provided; 1M rows, EU-Labour-Force-Survey schema). Train a gradient-boosted regression model (or small MLP) via federated averaging over 50 rounds. Apply DP-SGD, with gradient clipping and Gaussian noise in every round. Note that federated averaging and DP-SGD apply directly to the MLP, which is trained by gradient descent; a gradient-boosted model has no weight vector to average, so it needs a tree-specific aggregation and noise scheme. Measure accuracy against a centralized baseline and quantify the accuracy loss at each privacy budget (epsilon = 1, 4, 8). Deliver the source code, a Jupyter notebook with accuracy-versus-epsilon plots, and a 6-page memo recommending whether FL+DP is viable for the office's cross-municipality use cases.
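To make the server-side step of one training round concrete, here is a minimal sketch in plain Python. It is not Flower code and not the required implementation. It assumes the model is a flat list of weights (the MLP case) and that each municipality sends back a weight delta. It also swaps in a simpler variant of the privacy mechanism: it clips each client's whole update at the server (client-level DP in the style of DP-FedAvg) rather than clipping per-example gradients as DP-SGD proper does. The names `clip_update`, `dp_fedavg_round`, `clip_norm`, and `noise_multiplier` are illustrative, not part of any library.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale an update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def dp_fedavg_round(global_weights, client_updates, clip_norm, noise_multiplier, rng):
    """One server round: clip each client update, average, add Gaussian noise.

    client_updates holds one weight delta per client
    (local_weights - global_weights). The average is unweighted, so a single
    client can shift the mean by at most clip_norm / n_clients in L2 norm;
    the noise standard deviation is scaled to that bound.
    """
    n = len(client_updates)
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    mean_update = [sum(col) / n for col in zip(*clipped)]
    sigma = noise_multiplier * clip_norm / n
    noisy_update = [m + rng.gauss(0.0, sigma) for m in mean_update]
    return [w + d for w, d in zip(global_weights, noisy_update)]
```

The sketch does not compute epsilon. Translating a noise multiplier and 50 rounds into a privacy budget of 1, 4, or 8 requires a privacy accountant, which is left to the DP library you choose.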
The Brief
What you'll do, and what you'll demonstrate.
Build a federated-learning + differential-privacy prototype across 8 simulated municipalities and quantify the accuracy/privacy tradeoff.
Earning criteria — what you'll demonstrate
- Implement federated averaging with a real FL framework
- Integrate DP-SGD into federated training loops
- Quantify accuracy vs privacy-budget tradeoffs honestly
- Communicate FL + DP guarantees to a senior statistician audience
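For the tradeoff criterion above, one possible way to tabulate the accuracy loss per epsilon is sketched below. It assumes RMSE as the error metric, which the brief does not specify; the function name `accuracy_loss_table` and its arguments are hypothetical, and the inputs are whatever your own centralized and federated runs produce.

```python
def accuracy_loss_table(baseline_rmse, rmse_by_epsilon):
    """Return rows of (epsilon, rmse, absolute loss, relative loss in %).

    baseline_rmse: RMSE of the centralized, non-private model.
    rmse_by_epsilon: dict mapping each epsilon to the RMSE of the FL+DP model.
    """
    if baseline_rmse <= 0:
        raise ValueError("baseline_rmse must be positive")
    rows = []
    for eps in sorted(rmse_by_epsilon):
        rmse = rmse_by_epsilon[eps]
        abs_loss = rmse - baseline_rmse
        rows.append((eps, rmse, abs_loss, 100.0 * abs_loss / baseline_rmse))
    return rows
```

Calling it with the centralized RMSE and a dict keyed by 1, 4, and 8 yields one row per privacy budget, ready to plot or paste into the memo. Reporting the non-private federated run as an extra row helps separate the loss caused by federation from the loss caused by noise.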
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.