Video Action Recognition for a Retail Loss-Prevention Startup
Overview
What this challenge is about.
Use a public action-recognition dataset (UCF101) plus a small curated retail-action subset; the subset is provided as synthetic data, or you can label 50 short clips yourself. Fine-tune a small backbone (X3D-S or VideoMAE-tiny). Evaluate top-1 accuracy and a strict false-positive (FP) budget (under 1 FP per 1,000 customer-minutes) at recall above 75 percent. Deliver the model, the evaluation, and a 4-page memo addressed jointly to product and legal.
The Brief
What you'll do, and what you'll demonstrate.
Train an action-recognition model that achieves at least 75 percent recall on suspicious actions while staying under 1 FP per 1,000 customer-minutes.
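One way to check the two targets together is to sweep a decision threshold over per-clip "suspicious" scores and keep only the operating points that satisfy both. The sketch below is illustrative, not a prescribed evaluation protocol: the names `operating_point` and `find_threshold` are hypothetical, and it assumes each clip has one score, a binary label (1 = suspicious), and that you know the total customer-minutes of footage the clips cover.

```python
from dataclasses import dataclass


@dataclass
class OperatingPoint:
    threshold: float
    recall: float
    fp_per_1000_min: float


def operating_point(scores, labels, customer_minutes, threshold):
    """Recall and FPs per 1,000 customer-minutes at one score threshold."""
    # A clip is flagged when its score is at or above the threshold.
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    positives = sum(labels)
    recall = tp / positives if positives else 0.0
    # Assumption: customer_minutes is the total footage duration evaluated.
    fp_rate = 1000.0 * fp / customer_minutes
    return OperatingPoint(threshold, recall, fp_rate)


def find_threshold(scores, labels, customer_minutes,
                   min_recall=0.75, max_fp_per_1000=1.0):
    """Return the highest-recall point meeting both targets, or None."""
    candidates = [
        operating_point(scores, labels, customer_minutes, t)
        for t in sorted(set(scores))
    ]
    # "Under 1 FP" is read as a strict inequality; recall is "at least".
    feasible = [
        p for p in candidates
        if p.recall >= min_recall and p.fp_per_1000_min < max_fp_per_1000
    ]
    return max(feasible, key=lambda p: (p.recall, p.threshold), default=None)
```

If `find_threshold` returns `None`, no threshold meets both targets on that evaluation set, which is itself a finding worth reporting in the memo.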
Earning criteria — what you'll demonstrate
- Fine-tune a video action-recognition model on a domain-extended dataset
- Evaluate under a hard FP budget tied to product cost
- Diagnose action-confusion failure modes
- Communicate model boundaries to legal and product stakeholders
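For the failure-mode diagnosis named in the criteria above, a simple starting point is to count which (true, predicted) class pairs the model mixes up most often. This is a minimal sketch under stated assumptions: `top_confusions` is a hypothetical helper, and the class names in the usage example are invented for illustration, not taken from the provided dataset.

```python
from collections import Counter


def top_confusions(true_labels, predicted_labels, k=3):
    """Return the k most frequent (true, predicted) pairs that disagree."""
    errors = Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )
    return errors.most_common(k)


# Hypothetical labels for five evaluation clips.
true_labels = ["conceal", "conceal", "browse", "return", "conceal"]
predicted = ["return", "return", "browse", "conceal", "conceal"]
print(top_confusions(true_labels, predicted, k=1))
```

Pairs that dominate this list (for example, a concealment action predicted as returning an item to the shelf) are natural candidates to describe as model boundaries in the memo.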
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Machine Learning Engineer
Video action recognition under a hard FP budget is day-one MLE work at retail-AI and surveillance-AI companies.
This challenge sharpens
- video-understanding
- action-recognition
- ml-pipelines
Computer Vision Engineer
Action recognition with temporal evaluation is the CV-engineer specialization that video product teams hire for.
This challenge sharpens
- action-recognition
- video-understanding
- transfer-learning
AI Safety Researcher
Designing for an FP budget and explicitly addressing legal stakeholders is the kind of safety-aware engineering that AI safety researchers practice.
This challenge sharpens
- model-evaluation
- video-understanding
- action-recognition