Triage Medical-Imaging Annotations with a Small Vision Model
Overview
What this challenge is about.
Train a binary normal/abnormal classifier on the public CheXpert or NIH ChestX-ray14 dataset. Calibrate its output probabilities with temperature scaling, then set abstention thresholds so that the 'uncertain' bucket holds the cases where a radiologist's time is most valuable. Evaluate triage-throughput gains under a realistic radiologist budget (3 doctors × 6 hours/day × 60 reads/hour, or 1,080 reads/day), and write a 3-page memo to the head of clinical operations that emphasizes what the model must not be used for.
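As a rough illustration of the calibration step, the sketch below fits a single temperature T on held-out validation logits by minimizing negative log-likelihood, then divides logits by T before the sigmoid. It assumes a binary classifier that emits one logit per image; the synthetic data, the grid search, and the names `fit_temperature` and `nll` are illustrative choices, not part of the challenge spec.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nll(logits, labels, temperature):
    """Mean binary cross-entropy of sigmoid(logits / T) against 0/1 labels."""
    p = np.clip(sigmoid(logits / temperature), 1e-12, 1 - 1e-12)
    return float(-np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p)))

def fit_temperature(val_logits, val_labels, grid=None):
    """Pick the T > 0 that minimizes validation NLL (plain grid search)."""
    if grid is None:
        grid = np.linspace(0.05, 10.0, 400)
    losses = [nll(val_logits, val_labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

# Synthetic stand-in for a held-out validation split (assumption: not real data).
rng = np.random.default_rng(0)
true_logits = rng.normal(0.0, 1.5, size=5000)
val_labels = (rng.random(5000) < sigmoid(true_logits)).astype(float)
val_logits = 3.0 * true_logits  # simulate a model that is 3x too confident

T = fit_temperature(val_logits, val_labels)
calibrated_probs = sigmoid(val_logits / T)
```

Temperature scaling only rescales confidence; it does not change the ranking of images, so accuracy and AUC stay the same while the probabilities become more trustworthy as inputs to thresholds.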
The Brief
What you'll do, and what you'll demonstrate.
Build a calibrated, abstention-aware triage classifier that doubles effective radiologist throughput on a pile of 30,000 unlabeled X-rays.
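The throughput target can be sanity-checked with simple arithmetic. The sketch below uses the budget from the overview (3 doctors, 6 hours/day, 60 reads/hour) and assumes one reading of "effective throughput": images dispositioned per radiologist read. Under that assumption, doubling throughput means at most half the pile needs a human read. The 0.5 fraction and the helper `days_to_clear` are hypothetical, and how the remaining images are handled is a workflow decision this sketch does not address.

```python
import math

DOCTORS = 3
HOURS_PER_DAY = 6
READS_PER_HOUR = 60
PILE_SIZE = 30_000

# Daily reading capacity of the whole team.
reads_per_day = DOCTORS * HOURS_PER_DAY * READS_PER_HOUR

def days_to_clear(pile_size, human_read_fraction, daily_capacity=reads_per_day):
    """Working days needed if only `human_read_fraction` of the pile gets a human read."""
    return math.ceil(pile_size * human_read_fraction / daily_capacity)

baseline_days = days_to_clear(PILE_SIZE, 1.0)  # every image read by a radiologist
triaged_days = days_to_clear(PILE_SIZE, 0.5)   # hypothetical: half the pile routed to humans

print(reads_per_day, baseline_days, triaged_days)
```

With these numbers the team reads 1,080 images per day, so the full pile takes 28 working days unassisted and 14 if only half of it needs a human read.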
Earning criteria — what you'll demonstrate
- Fine-tune a pretrained vision backbone on medical imaging
- Calibrate model outputs and translate them into operational thresholds
- Design an abstention mechanism that maps to human-in-the-loop workflow
- Communicate model boundaries to clinical stakeholders
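One way to turn calibrated probabilities into a human-in-the-loop workflow is a two-threshold rule: confident scores go to a "likely normal" or "likely abnormal" queue, and everything in between goes to an "uncertain" queue for priority review. The sketch below is a minimal version under stated assumptions: the bucket names, the symmetric band around 0.5, and the budget-driven helper `pick_band` are illustrative, and a real deployment would also validate error rates inside each bucket on labeled data.

```python
import numpy as np

def route(probs, low, high):
    """Map calibrated P(abnormal) to one of three queues per image."""
    probs = np.asarray(probs)
    return np.where(
        probs <= low, "likely_normal",
        np.where(probs >= high, "likely_abnormal", "uncertain"),
    )

def pick_band(probs, max_uncertain_fraction, center=0.5):
    """Widest symmetric band around `center` whose uncertain share fits the budget."""
    probs = np.asarray(probs)
    best = (center, center)
    for half_width in np.linspace(0.0, 0.5, 51):
        low, high = center - half_width, center + half_width
        uncertain_share = np.mean((probs > low) & (probs < high))
        if uncertain_share <= max_uncertain_fraction:
            best = (low, high)  # share grows with width, so the last fit is the widest
    return best

# Synthetic calibrated probabilities (assumption: stand-in for real model output).
rng = np.random.default_rng(1)
probs = rng.beta(0.5, 0.5, size=10_000)

low, high = pick_band(probs, max_uncertain_fraction=0.30)
buckets = route(probs, low, high)
uncertain_share = float(np.mean(buckets == "uncertain"))
```

Tying the band width to a review budget keeps the uncertain queue within what the radiologists can actually read; an alternative design fixes the thresholds from a target error rate and lets the queue size vary.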
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Machine Learning Engineer
Fine-tuning a vision model and shipping a calibrated, threshold-tuned inference script is core MLE work at any imaging-AI company.
This challenge sharpens
- image-classification
- transfer-learning
- ml-pipelines
AI Safety Researcher
Abstention design and explicit non-use documentation are exactly the kind of safety-aware engineering that AI safety researchers practice in high-stakes domains.
This challenge sharpens
- calibration
- model-evaluation
- image-classification
Applied AI Scientist
Translating model outputs into a human-in-the-loop workflow that respects clinical realities is applied AI work at its most consequential.
This challenge sharpens
- calibration
- transfer-learning
- model-evaluation