Overview
What this challenge is about.
Use a public driving video dataset (e.g., the Argoverse 2 Sensor Dataset or BDD100K) and curate ~6,000 short clips, each labeled with a three-class lane-change intent for a neighboring vehicle. Train a temporal model (e.g., a small VideoMAE, TimeSformer, or a CNN + GRU baseline) at two model sizes. Report macro-F1 on a held-out city split (train on city A, test on city B) so that you measure generalization, not memorization. Profile inference latency on a documented GPU that stands in for the on-device target (an RTX 4060 mobile is fine). Build confusion matrices sliced by ego-speed band and time of day. Produce a 2-page integration note covering accuracy, latency, failure modes, and a suggested confidence threshold for downstream consumers.
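As a rough illustration of the evaluation step, the sketch below splits clips by city and computes macro-F1 by hand. The record layout (`city`, `label`) and the integer class ids 0 to 2 are assumptions made for the example, not a format required by the challenge or by either dataset.

```python
def split_by_city(clips, train_city, test_city):
    """Return (train, test) lists so that no city appears in both.

    `clips` is assumed to be a list of dicts with a "city" key
    (a hypothetical layout used only for this sketch).
    """
    train = [c for c in clips if c["city"] == train_city]
    test = [c for c in clips if c["city"] == test_city]
    return train, test


def macro_f1(y_true, y_pred, num_classes=3):
    """Unweighted mean of per-class F1 over integer class ids."""
    f1_scores = []
    for k in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == k and p == k)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != k and p == k)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == k and p != k)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0 here.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Tiny usage example with made-up clips.
clips = [
    {"city": "A", "label": 0},
    {"city": "A", "label": 1},
    {"city": "B", "label": 2},
]
train_clips, test_clips = split_by_city(clips, train_city="A", test_city="B")
score = macro_f1([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
```

Macro averaging weights every class equally, so a rare intent class pulls the score down as much as a common one. That is usually what you want when one class dominates the clips.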
The Brief
What you'll do, and what you'll demonstrate.
Train a temporal vision model that classifies neighboring-vehicle lane-change intent from dashcam video, and characterize its real-world failure modes for the perception team.
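One way to characterize failure modes is to keep a separate confusion matrix for each (ego-speed band, time-of-day) slice. The sketch below assumes per-clip records with `ego_speed_mps`, `time_of_day`, `label`, and `pred` fields. Those field names and the speed band edges are hypothetical, so choose ones that match your curated metadata.

```python
from collections import defaultdict


def speed_band(speed_mps):
    """Map ego speed to a coarse band (band edges are illustrative only)."""
    if speed_mps < 10.0:
        return "low"
    if speed_mps < 20.0:
        return "mid"
    return "high"


def sliced_confusion(records, num_classes=3):
    """Build one confusion matrix per (speed band, time-of-day) slice.

    Rows index the true class and columns index the predicted class.
    """
    matrices = defaultdict(
        lambda: [[0] * num_classes for _ in range(num_classes)]
    )
    for r in records:
        key = (speed_band(r["ego_speed_mps"]), r["time_of_day"])
        matrices[key][r["label"]][r["pred"]] += 1
    return dict(matrices)


# Made-up records for illustration.
records = [
    {"ego_speed_mps": 5.0, "time_of_day": "day", "label": 0, "pred": 0},
    {"ego_speed_mps": 6.0, "time_of_day": "day", "label": 0, "pred": 1},
    {"ego_speed_mps": 25.0, "time_of_day": "night", "label": 2, "pred": 2},
]
slices = sliced_confusion(records)
```

Slices with very few clips give noisy matrices, so it helps to report the per-slice sample count alongside each one.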
Earning criteria — what you'll demonstrate
- Apply temporal vision models (VideoMAE, TimeSformer, or CNN+GRU) to a short-clip classification task
- Use a cross-city train/test split to measure real-world generalization
- Profile a model for on-device latency and reason about deployment trade-offs
- Communicate a model's failure modes honestly to a downstream consumer team
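For the latency criterion, a minimal timing harness might look like the sketch below. It times any zero-argument callable, so `infer` is a placeholder for your own forward pass, and the warmup and run counts are arbitrary defaults. When timing on a GPU with PyTorch, the callable would also need to call `torch.cuda.synchronize()` before returning, because CUDA kernels launch asynchronously and the wall-clock reading would otherwise be too short.

```python
import statistics
import time


def profile_latency(infer, n_warmup=10, n_runs=100):
    """Time a zero-argument callable and summarize latency in milliseconds."""
    # Warmup runs absorb one-time costs such as caching and lazy initialization.
    for _ in range(n_warmup):
        infer()

    samples_ms = []
    for _ in range(n_runs):
        start = time.perf_counter()
        infer()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Simple index-based estimate of the 95th percentile.
    p95_index = min(len(samples_ms) - 1, int(0.95 * len(samples_ms)))
    return {
        "mean_ms": statistics.fmean(samples_ms),
        "p50_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
    }


# Stand-in workload; replace with a call into the trained model.
stats = profile_latency(lambda: sum(range(1000)), n_warmup=5, n_runs=50)
```

Reporting a tail percentile next to the mean is useful in the integration note, since a downstream consumer typically budgets for the slow frames, not the average one.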
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Computer Vision Engineer
Shipping a temporal vision model with city-split evaluation and a deployment-aware latency profile is exactly the day-one task of a CV engineer on a perception team.
This challenge sharpens
- video-understanding
- temporal-modeling
- perception
Machine Learning Engineer
Training a model at two sizes, profiling latency, and writing an integration note for a consumer team mirrors MLE work in any production CV stack.
This challenge sharpens
- model-evaluation
- pytorch
- temporal-modeling
Applied AI Scientist
Building a defensible generalization story (cross-city split, sliced failure modes) and translating it into a threshold recommendation is the texture of applied-AI-scientist work in autonomous vehicles.
This challenge sharpens
- generalization
- model-evaluation
- video-understanding