Capstone Lab: Diagnose Why a Production Model Quietly Stopped Working
Overview
What this challenge is about.
You receive 6 months of production logs (model inputs, predictions, ground truth from chargebacks) plus the original training data and model card. Reproduce the recall drop in a notebook, run drift diagnostics across features (e.g., PSI, KS test) and across the label distribution, and isolate the most likely root cause. Propose two concrete fixes (one short-term, one structural) and quantify the expected recovery. Write a 3-page postmortem in the team's standard template, including a 'what would have caught this earlier' section.
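As a rough illustration of the drift diagnostics named above, here is a minimal standard-library sketch of PSI and the two-sample KS statistic for one numeric feature. The function names, the decile binning, and the `eps` floor are assumptions made for illustration, not part of the lab materials. In a real notebook, `scipy.stats.ks_2samp` would also give a p-value.

```python
import math
from bisect import bisect_right


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index of `actual` against the `expected` baseline.

    Assumes both samples are non-empty lists of numbers. Bin edges come from
    baseline quantiles; `eps` is an illustrative floor that avoids log(0).
    """
    ref = sorted(expected)
    # Interior bin edges taken at baseline quantiles (deciles by default).
    edges = sorted({ref[int(len(ref) * i / n_bins)] for i in range(1, n_bins)})

    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect_right(edges, v)] += 1
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d
```

A typical use would be to call both functions per feature, with the training data as `expected` and each month of production inputs as `actual`, then rank features by the result. A common rule of thumb treats PSI above roughly 0.25 as a major shift, but that threshold is a heuristic, not a guarantee.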
The Brief
What you'll do, and what you'll demonstrate.
Diagnose a quiet production-model degradation, identify root cause from logs, and write the postmortem the team will actually act on.
Earning criteria — what you'll demonstrate
- Reproduce a production ML failure from logs alone
- Apply drift-detection statistics to real data
- Distinguish data drift, schema change, and concept drift in practice
- Write a postmortem that drives a real fix, not just blame
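Reproducing the failure from logs usually starts with recall per time window. The sketch below assumes a hypothetical log layout of `(month, predicted_positive, actual_positive)` tuples; the real log schema in the lab may differ.

```python
from collections import defaultdict


def recall_by_month(rows):
    """Recall per month from (month, predicted_positive, actual_positive) rows.

    The row layout is an assumption for illustration. Months with no actual
    positives are omitted, because recall is undefined for them.
    """
    tp = defaultdict(int)  # actual positives the model flagged
    fn = defaultdict(int)  # actual positives the model missed
    for month, predicted, actual in rows:
        if actual:
            if predicted:
                tp[month] += 1
            else:
                fn[month] += 1
    months = sorted(set(tp) | set(fn))
    return {m: tp[m] / (tp[m] + fn[m]) for m in months}
```

Because the ground truth here comes from chargebacks, labels for the most recent weeks may still be incomplete, so the latest window is worth reading with caution.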
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
MLOps Engineer
Diagnosing a quiet production degradation, identifying the drift, and writing the postmortem are the bread and butter of on-call MLOps work on an ML team.
This challenge sharpens
- data-drift-detection
- model-monitoring
- root-cause-analysis
Machine Learning Engineer
Reproducing failures from logs and proposing structural fixes are the MLE skills that separate engineers who keep models running from those who only ship v1s.
This challenge sharpens
- root-cause-analysis
- feature-engineering
- python
Data Scientist
Drift-detection statistics and chargeback-pipeline reasoning are core data-scientist skills for any team supporting production models.
This challenge sharpens
- data-drift-detection
- feature-engineering
- model-monitoring