Evaluate Intelligent Agents in a Sustainable-Logistics Simulation
Overview
What this challenge is about.
Build a small grid-world simulation in which a delivery agent must complete 10 stops while respecting traffic, charging needs, and time windows. Implement three classical agent designs: purely reactive (table lookup), model-based (maintains an internal state), and goal-based (plans toward a goal). Run each agent on 50 random scenarios; measure completion rate, distance, and time; and produce a 20-minute talk (slides plus a 5-minute live demo) explaining the trade-offs. The audience is policy folks, not engineers, so strip jargon ruthlessly.
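To make the three designs named above concrete, here is a minimal Python sketch of how they might differ when choosing a single move. Everything in it is an assumption made for illustration: the MOVES table, the act(pos, target, blocked) signature, the square grid of side size, and the use of blocked cells as a stand-in for traffic. Charging and time windows are left out to keep the contrast visible.

```python
from collections import deque

# Hypothetical action set: each action is a (dx, dy) step on the grid.
MOVES = {"N": (0, -1), "S": (0, 1), "E": (1, 0), "W": (-1, 0)}

def sign(v):
    return (v > 0) - (v < 0)

def in_grid(cell, size):
    return 0 <= cell[0] < size and 0 <= cell[1] < size

class ReactiveAgent:
    """Table lookup: the current percept alone determines the action."""
    # Percept = (sign of dx, sign of dy) from the agent to the target stop.
    TABLE = {(1, 0): "E", (-1, 0): "W", (0, 1): "S", (0, -1): "N",
             (1, 1): "E", (1, -1): "E", (-1, 1): "W", (-1, -1): "W"}

    def act(self, pos, target, blocked):
        percept = (sign(target[0] - pos[0]), sign(target[1] - pos[1]))
        return self.TABLE.get(percept)  # None when already at the target

class ModelBasedAgent:
    """Keeps an internal state: every blocked cell it has observed so far."""
    def __init__(self, size):
        self.size = size
        self.known_blocked = set()

    def act(self, pos, target, blocked):
        if pos == target:
            return None
        best = None
        for name, (dx, dy) in MOVES.items():
            cell = (pos[0] + dx, pos[1] + dy)
            if cell in blocked:  # the agent only senses adjacent cells
                self.known_blocked.add(cell)
            if not in_grid(cell, self.size) or cell in self.known_blocked:
                continue
            dist = abs(target[0] - cell[0]) + abs(target[1] - cell[1])
            if best is None or dist < best[0]:
                best = (dist, name)
        return best[1] if best else None

class GoalBasedAgent:
    """Plans toward the goal: breadth-first search over the full map,
    which this sketch assumes the agent knows in advance."""
    def __init__(self, size):
        self.size = size

    def act(self, pos, target, blocked):
        frontier = deque([(pos, None)])  # (cell, first move on the path)
        seen = {pos}
        while frontier:
            cell, first = frontier.popleft()
            if cell == target:
                return first
            for name, (dx, dy) in MOVES.items():
                nxt = (cell[0] + dx, cell[1] + dy)
                if in_grid(nxt, self.size) and nxt not in blocked and nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, first or name))
        return None  # no route to the target
```

With the agent at (0, 0), the target at (2, 0), and a blocked cell at (1, 0) on a 3x3 grid, the reactive agent still answers "E" and drives into the obstruction, while the other two route around it. That single case is the kind of behavior a policy audience can follow without any jargon.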
The Brief
What you'll do, and what you'll demonstrate.
Compare three classical intelligent-agent designs on a last-mile delivery simulation and present results to a public-policy audience.
Earning criteria — what you'll demonstrate
- Implement and compare classical intelligent-agent architectures
- Design a fair simulation tournament with statistical reporting
- Translate agent behavior into policy-audience language
- Run a live demo without it breaking on stage
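One possible shape for the fair tournament with statistical reporting listed above is sketched here in Python. It rests on assumptions the brief does not state: make_scenario, run_tournament, and the run_episode callback are hypothetical names, scenarios are reduced to a list of stop coordinates, and the Wilson score interval is just one reasonable choice for reporting uncertainty on a completion rate from 50 runs.

```python
import math
import random
import statistics

def make_scenario(seed, size=10, n_stops=10):
    """Hypothetical generator: the same seed always yields the same stops."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    return rng.sample(cells, n_stops)

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

def run_tournament(agent_factories, run_episode, n_scenarios=50):
    """agent_factories: name -> zero-argument callable returning a fresh agent.
    run_episode(agent, stops) -> (completed, distance, time); assumed to be
    supplied by your simulation."""
    # Every agent faces the identical scenario list, so differences in the
    # report come from the agents rather than from luck of the draw.
    scenarios = [make_scenario(seed) for seed in range(n_scenarios)]
    report = {}
    for name, factory in agent_factories.items():
        # A fresh agent per scenario keeps internal state from leaking across runs.
        results = [run_episode(factory(), stops) for stops in scenarios]
        done = sum(1 for completed, _, _ in results if completed)
        report[name] = {
            "completion_rate": done / n_scenarios,
            "completion_ci": wilson_interval(done, n_scenarios),
            "mean_distance": statistics.mean(d for _, d, _ in results),
            "mean_time": statistics.mean(t for _, _, t in results),
        }
    return report
```

Reporting the interval next to the rate helps keep the talk honest: with only 50 scenarios, two agents whose intervals overlap heavily may not be meaningfully different.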
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
AI Product Manager
Comparing agent designs and presenting trade-offs to a non-technical audience is the AI PM's bread and butter in any AI-policy-adjacent org.
This challenge sharpens
- intelligent-agents
- policy-communication
- comparative-evaluation
AI Engineer
Implementing three classical agent designs and standing up a working simulation is the AI engineer's daily craft.
This challenge sharpens
- intelligent-agents
- simulation-design
- python
Data Scientist
Designing a fair comparative evaluation with honest statistical reporting is the data scientist's lens applied to a simulation tournament.
This challenge sharpens
- comparative-evaluation
- data-visualization
- simulation-design