Overview
What this challenge is about.
Take a frozen ResNet-50 (or a similar model) exported to ONNX. Compile and benchmark it across three targets (Jetson, GPU, and x86 CPU): TensorRT on the Jetson and the GPU, ONNX Runtime on all three, OpenVINO on the x86 CPU, and IREE on ARM if time allows. On each target, measure p50/p99 latency, throughput, and peak memory at batch sizes 1, 4, and 16. Report compile time, build-system friction, and one 'gotcha' per stack. Recommend one stack per target, backed by a trade-off table, and write a 4-page memo for the platform team.
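The latency and throughput measurements described above can be sketched with a small, runtime-agnostic timing harness. This is a minimal sketch, not a prescribed method: `benchmark`, `percentile`, and the `run_batch` callable are hypothetical names, the warmup and iteration counts are arbitrary defaults, and it assumes `run_batch` blocks until the inference result is ready (on GPU runtimes that usually means synchronizing inside the callable). Peak memory is not covered here.

```python
import math
import time
from typing import Callable, Dict, Sequence


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile of an ascending-sorted sequence, with q in (0, 100]."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(q / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def benchmark(run_batch: Callable[[], object], batch_size: int,
              warmup: int = 10, iters: int = 100) -> Dict[str, float]:
    """Time one inference call per iteration and summarize the results.

    run_batch is a hypothetical zero-argument callable that runs a single
    batch of `batch_size` inputs and returns only when the output is ready.
    """
    # Warmup runs are discarded so lazy initialization and cache effects
    # do not skew the timed samples.
    for _ in range(warmup):
        run_batch()

    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        run_batch()
        latencies_ms.append((time.perf_counter() - start) * 1e3)

    latencies_ms.sort()
    total_s = sum(latencies_ms) / 1e3
    return {
        "batch_size": batch_size,
        "p50_ms": percentile(latencies_ms, 50),
        "p99_ms": percentile(latencies_ms, 99),
        # Images per second over the timed iterations only.
        "throughput_ips": batch_size * iters / total_s,
    }


if __name__ == "__main__":
    def fake_inference() -> int:
        # Stand-in workload; replace with a real runtime call.
        return sum(i * i for i in range(10_000))

    for bs in (1, 4, 16):
        print(benchmark(fake_inference, bs, warmup=5, iters=50))
```

Using the same harness, warmup count, and iteration count for every stack is one way to keep the comparison fair; only the body of `run_batch` changes per runtime.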
The Brief
What you'll do, and what you'll demonstrate.
Pick the best ML compiler per hardware target with a fair benchmark and a memo that defends the choices.
Earning criteria — what you'll demonstrate
- Understand the ML-compiler stack landscape end to end
- Run a fair cross-compiler benchmark on shared hardware
- Quantify the accuracy, latency, and memory trade-offs of compilation
- Recommend a compiler stack per hardware target with evidence
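For the accuracy side of the trade-off, one common approach is to compare each compiled model's outputs against a reference run on the same inputs. The sketch below is an assumption about how that check could look, not part of the brief: `compare_outputs` is a hypothetical helper, outputs are taken to be per-sample lists of class scores, and top-1 agreement plus maximum absolute difference are illustrative metrics.

```python
from typing import Dict, Sequence


def argmax(scores: Sequence[float]) -> int:
    """Index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


def compare_outputs(reference: Sequence[Sequence[float]],
                    candidate: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Compare a compiled model's outputs with a reference run.

    Both arguments hold one row of class scores per input sample, produced
    from the same inputs in the same order.
    """
    if not reference or len(reference) != len(candidate):
        raise ValueError("need equal, non-zero numbers of samples")

    agree = 0
    max_abs_diff = 0.0
    for ref_row, cand_row in zip(reference, candidate):
        if len(ref_row) != len(cand_row):
            raise ValueError("score rows differ in length")
        if argmax(ref_row) == argmax(cand_row):
            agree += 1
        row_diff = max(abs(r - c) for r, c in zip(ref_row, cand_row))
        max_abs_diff = max(max_abs_diff, row_diff)

    return {
        # Fraction of samples where both runs predict the same class.
        "top1_agreement": agree / len(reference),
        # Largest element-wise deviation across all samples.
        "max_abs_diff": max_abs_diff,
    }


if __name__ == "__main__":
    ref = [[0.1, 0.9], [0.8, 0.2]]
    cand = [[0.2, 0.8], [0.4, 0.6]]
    print(compare_outputs(ref, cand))
```

Reporting these two numbers next to the latency and memory figures makes it visible when a faster build (for example, one using reduced precision) changes predictions.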
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
ML Researcher
Designing fair cross-compiler benchmarks and writing the trade-off memo is the kind of applied-systems research work that platform-research teams hire for.
This challenge sharpens
- ml-compilers
- benchmarking
- model-evaluation
Machine Learning Engineer
Owning a model's deployment path across multiple hardware targets via compilers is the work MLEs increasingly handle on robotics and edge teams.
This challenge sharpens
- tensorrt
- onnx
- hardware-targeting
MLOps Engineer
Standardizing a compiler stack across hardware targets and writing the per-stack quickstart is the platform-MLOps work that scales an ML team beyond ad-hoc deploys.
This challenge sharpens
- ml-compilers
- benchmarking
- hardware-targeting