Build a Dataflow-Based Dead-Code Detector for a Python Monorepo
Overview
What this challenge is about.
Build a Python tool using libcst (or ast + jedi) that constructs a call graph across the monorepo. The graph must account for indirect references that a plain call-site scan misses: entry points declared in setup.py / pyproject.toml, dynamic dispatch via getattr, Django URL routes, and Celery task registration. Flag functions, classes, and modules that have no callers and no such indirect reference as dead-code candidates. Run the tool on the monorepo and produce a candidate report. Manually validate the top 50 findings and classify each into one of three buckets: safe to delete, gated behind a feature flag, or false positive (an indirect reference the tool missed). Deliver the tool source, the candidate report, the triage spreadsheet, a 5-page methodology report, and a list of proposed deletion PRs, each under 1K lines so it stays reviewable.
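To make the call-graph step concrete, here is a minimal sketch that uses only the standard-library ast module. It is one possible starting point, not a prescribed design. It assumes functions can be matched by bare name (no import or attribute resolution, which a libcst or jedi based tool would add), and the names CallCollector, find_dead_functions, and the sample module are hypothetical.

```python
import ast
from collections import defaultdict


class CallCollector(ast.NodeVisitor):
    """Record which bare names each function (or module body) calls."""

    def __init__(self, module: str) -> None:
        # Pseudo-caller that owns calls made by top-level module code.
        self.scope = [f"{module}:<module>"]
        self.defined: set[str] = set()
        self.edges: dict[str, set[str]] = defaultdict(set)

    def visit_FunctionDef(self, node) -> None:
        self.defined.add(node.name)
        self.scope.append(node.name)
        self.generic_visit(node)
        self.scope.pop()

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_Call(self, node: ast.Call) -> None:
        func = node.func
        if isinstance(func, ast.Name):          # helper()
            self.edges[self.scope[-1]].add(func.id)
        elif isinstance(func, ast.Attribute):   # obj.helper()
            self.edges[self.scope[-1]].add(func.attr)
        self.generic_visit(node)


def find_dead_functions(sources: dict[str, str], roots: set[str]) -> list[str]:
    """Return defined functions that are unreachable from the given roots."""
    defined: set[str] = set()
    edges: dict[str, set[str]] = defaultdict(set)
    for module, source in sources.items():
        collector = CallCollector(module)
        collector.visit(ast.parse(source, filename=module))
        defined |= collector.defined
        for caller, callees in collector.edges.items():
            edges[caller] |= callees

    # Module-level code runs on import, so treat it as an implicit root.
    worklist = list(roots) + [c for c in edges if c.endswith(":<module>")]
    reachable: set[str] = set()
    while worklist:
        name = worklist.pop()
        if name in reachable:
            continue
        reachable.add(name)
        worklist.extend(edges.get(name, ()))
    return sorted(defined - reachable)


# Hypothetical one-module "monorepo": orphan() is never called from a root.
SOURCES = {
    "app.py": (
        "def main():\n    helper()\n\n"
        "def helper():\n    pass\n\n"
        "def orphan():\n    helper()\n"
    ),
}
print(find_dead_functions(SOURCES, roots={"main"}))
```

Matching by bare name over-approximates reachability (two unrelated functions named helper keep each other alive), which trades recall for fewer false positives. A real tool would also need qualified names and class and module candidates.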
The Brief
What you'll do, and what you'll demonstrate.
Build a dataflow-based dead-code detector for a 600K-line Python monorepo that reaches better than 80 percent precision on the top 50 findings.
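The precision target can be checked directly from the triage labels. The sketch below is illustrative only: the bucket identifiers are hypothetical spellings of the three triage buckets, the label counts are made-up sample data rather than real results, and whether feature-flagged code counts as a correct finding is an assumption the brief does not settle, so it is exposed as a parameter.

```python
from collections import Counter

# Hypothetical identifiers for the three triage buckets.
BUCKETS = ("safe_to_delete", "feature_flagged", "false_positive")


def precision(labels: list[str], count_flagged_as_hit: bool = True) -> float:
    """Fraction of triaged findings that were correct dead-code candidates."""
    counts = Counter(labels)
    unknown = set(counts) - set(BUCKETS)
    if unknown:
        raise ValueError(f"unknown triage labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no findings were triaged")
    hits = counts["safe_to_delete"]
    if count_flagged_as_hit:
        # Assumption: flag-gated code is unreachable today, so the finding
        # is counted as correct even though it is not yet safe to delete.
        hits += counts["feature_flagged"]
    return hits / total


# Made-up sample of 50 triage labels, for illustration only.
sample = (
    ["safe_to_delete"] * 38
    + ["feature_flagged"] * 5
    + ["false_positive"] * 7
)
print(f"lenient: {precision(sample):.2f}")
print(f"strict:  {precision(sample, count_flagged_as_hit=False):.2f}")
```

Stating which definition was used belongs in the methodology report, since the same triage sheet can land on either side of the 80 percent line.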
Earning criteria — what you'll demonstrate
- Apply AST and call-graph analysis to a real-scale Python codebase
- Reason about indirect references that defeat naive call-graph analysis
- Build static-analysis tooling that respects the project's framework conventions
- Communicate static-analysis results to a team that has to actually delete the code
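Indirect references are where a naive call graph goes wrong, so it helps to see how a few of them can be recovered statically. The sketch below, again using only the standard-library ast module, treats functions registered through Celery-style decorators and names passed to getattr as string literals as extra roots. The decorator set, the function name indirect_roots, and the sample source are assumptions for illustration. A getattr whose name is computed at runtime cannot be resolved this way and would need a separate, more conservative policy.

```python
import ast

# Assumption: decorators whose final name is in this set register the
# function with a framework (e.g. Celery's @shared_task or @app.task).
REGISTERING_DECORATORS = {"task", "shared_task"}


def _decorator_name(decorator: ast.expr) -> "str | None":
    """Return the last name segment of a decorator, with or without a call."""
    target = decorator.func if isinstance(decorator, ast.Call) else decorator
    if isinstance(target, ast.Name):
        return target.id
    if isinstance(target, ast.Attribute):
        return target.attr
    return None


def indirect_roots(source: str) -> set[str]:
    """Collect names kept alive by decorators or literal getattr lookups."""
    roots: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            names = (_decorator_name(d) for d in node.decorator_list)
            if any(name in REGISTERING_DECORATORS for name in names):
                roots.add(node.name)
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "getattr"
            and len(node.args) >= 2
            and isinstance(node.args[1], ast.Constant)
            and isinstance(node.args[1].value, str)
        ):
            # getattr(obj, "name"): the literal names a possible callee.
            roots.add(node.args[1].value)
    return roots


# Hypothetical source; it is only parsed, never imported or executed.
EXAMPLE = '''
from celery import shared_task

@shared_task
def send_report():
    pass

@app.task(bind=True)
def rebuild_index(self):
    pass

def dispatch(handler):
    return getattr(handler, "on_event")()

def unused():
    pass
'''
print(sorted(indirect_roots(EXAMPLE)))
```

The resulting names can be merged into the root set of a reachability pass. Django URL routes and packaging entry points would need their own extractors along the same lines.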
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Career mappings coming soon.