Overview
What this challenge is about.
You receive a CSV with about 18,000 student-month rows. Features include login frequency, session length, quiz scores, parent app opens, and plan tier. The target is whether the student cancelled within the next 30 days. Split the data chronologically, not randomly: a random split would let the model train on months that come after the ones it is tested on, which leaks future information into training. Train at least three model families (logistic regression, random forest, gradient-boosted trees), each regularized and tuned with cross-validation, and pick a final model using AUROC (area under the ROC curve) plus a precision-at-top-200 metric the CSM team will actually use. Deliver the ranked list and a one-page memo explaining the top three churn drivers in plain language.
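As a rough illustration of the chronological split described above, here is a minimal sketch using only the Python standard library. The column names (`student_id`, `month`, `logins`, `churned`), the toy rows, and the `chronological_split` helper are all assumptions made for the example; the real CSV schema may differ.

```python
from datetime import date


def chronological_split(rows, month_key, train_frac=0.7):
    """Split rows so that every training month precedes every test month."""
    months = sorted({row[month_key] for row in rows})
    if len(months) < 2:
        raise ValueError("need at least two distinct months to split")
    # Cut on a month boundary so no single month straddles train and test,
    # and keep at least one month on each side.
    cut = max(1, min(len(months) - 1, round(len(months) * train_frac)))
    train_months = set(months[:cut])
    train = [r for r in rows if r[month_key] in train_months]
    test = [r for r in rows if r[month_key] not in train_months]
    return train, test


# Toy student-month rows with hypothetical column names (not the real schema).
rows = [
    {
        "student_id": s,
        "month": date(2024, m, 1),
        "logins": 10 * s + m,
        "churned": int((s + m) % 5 == 0),
    }
    for s in range(1, 4)
    for m in range(1, 11)
]

train, test = chronological_split(rows, "month", train_frac=0.7)
print(len(train), "training rows up to", max(r["month"] for r in train))
print(len(test), "test rows from", min(r["month"] for r in test))
```

Cutting on a month boundary, rather than on a row count, keeps all rows from the same month on the same side of the split, so the test set always looks like "the future" relative to training.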
The Brief
What you'll do, and what you'll demonstrate.
Build a churn-prediction model that gives Customer Success a usable ranked list of at-risk students 30 days before cancellation.
Earning criteria — what you'll demonstrate
- Apply supervised learning to a real tabular business problem
- Choose appropriate evaluation metrics for an imbalanced classification task
- Use regularization and cross-validation to avoid overfitting
- Communicate model behaviour to a non-technical stakeholder
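To make the metric choice in the criteria above concrete, the two metrics named in the overview can be sketched in plain Python. The labels and scores below are made-up toy values, and the helper names `precision_at_k` and `auroc` are illustrative assumptions. The pairwise AUROC is written for readability; on the full dataset a library routine such as scikit-learn's `roc_auc_score` would be the more usual choice.

```python
def precision_at_k(y_true, scores, k):
    """Fraction of true churners among the k highest-scored students."""
    if k <= 0:
        raise ValueError("k must be positive")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


def auroc(y_true, scores):
    """Probability that a random churner outscores a random non-churner.

    Ties count as half a win. Pairwise and O(n^2), so for illustration only.
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 3 churners out of 10 students (1 = cancelled within 30 days).
y_true = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0]
scores = [0.10, 0.20, 0.90, 0.30, 0.80, 0.85, 0.05, 0.15, 0.40, 0.25]

p_at_3 = precision_at_k(y_true, scores, k=3)
auc = auroc(y_true, scores)
print(f"precision@3 = {p_at_3:.3f}, AUROC = {auc:.3f}")
```

In this toy set, a model that always predicts "no churn" would be 70% accurate while flagging nobody, which is why a ranking metric and a top-k metric are more informative here than accuracy.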
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Data Scientist
Framing a business problem, choosing the right evaluation metric, and shipping a ranked list that a non-technical team will use are core day-one tasks for a junior data scientist at a subscription business.
This challenge sharpens
- supervised-learning
- model-evaluation
- feature-engineering
Machine Learning Engineer
Wrapping preprocessing and a trained model into a reproducible pipeline is the first step toward shipping an ML system into production.
This challenge sharpens
- python
- gradient-boosting
- feature-engineering
Applied AI Scientist
Comparing model families with calibrated, leakage-free evaluation is the bread and butter of applied AI work at product-led startups.
This challenge sharpens
- logistic-regression
- gradient-boosting
- model-evaluation