Overview
What this challenge is about.
Read the system's existing capability spec and tool allow-list. Design at least 50 adversarial inputs across five categories: prompt injection, tool confusion, scope escape (the agent acts outside its intended workflow), goal hijacking, and infinite-loop bait. Run the campaign against a sandboxed instance. Categorize failures by severity (informational, risky, or catastrophic). Propose mitigations (input sanitizers, tool-call validators, action-budget caps, human-in-the-loop gates) and estimate the residual risk that remains after each. Write a 6-page audit report for the compliance team.
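One way to keep the campaign organized is to record every failed case with its category and severity, then tally the results for the report. The sketch below is only an illustration: the names `Category`, `Severity`, `Finding`, `summarize`, and `worst_severity` are hypothetical and not part of any provided tooling, and the case IDs are made up.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    # The five attack categories named in the overview.
    PROMPT_INJECTION = "prompt-injection"
    TOOL_CONFUSION = "tool-confusion"
    SCOPE_ESCAPE = "scope-escape"
    GOAL_HIJACKING = "goal-hijacking"
    INFINITE_LOOP_BAIT = "infinite-loop-bait"


class Severity(Enum):
    # The three severity levels named in the overview, ordered by impact.
    INFORMATIONAL = 1
    RISKY = 2
    CATASTROPHIC = 3


@dataclass(frozen=True)
class Finding:
    """One failed adversarial case (hypothetical record shape)."""
    case_id: str
    category: Category
    severity: Severity


def summarize(findings):
    """Count findings per (category, severity) pair."""
    return Counter((f.category, f.severity) for f in findings)


def worst_severity(findings):
    """Return the highest severity observed, or None if nothing failed."""
    return max((f.severity for f in findings), key=lambda s: s.value, default=None)
```

A table built from `summarize` gives the compliance team a category-by-severity view, while `worst_severity` offers a quick headline figure for the executive summary.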
The Brief
What you'll do, and what you'll demonstrate.
Run an independent safety audit of an agentic workflow and write the report that the compliance team relies on to sign off.
Earning criteria — what you'll demonstrate
- Design a systematic red-team campaign against agentic systems
- Identify common multi-agent / tool-using LLM failure modes
- Propose mitigations with realistic residual-risk reasoning
- Write an audit report that survives compliance review
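To make the mitigation and residual-risk criteria concrete, here is a minimal sketch of two of the mitigations named in the overview, a tool-call validator and an action-budget cap, together with a simple residual-risk figure. Everything here is an assumption for illustration: `ToolCallValidator`, `residual_failure_rate`, and the tool names are hypothetical, and the decision not to charge rejected calls against the budget is just one possible design.

```python
class ToolCallValidator:
    """Gate each tool call against an allow-list and an action budget."""

    def __init__(self, allowed_tools, max_actions):
        self.allowed_tools = frozenset(allowed_tools)
        self.max_actions = max_actions
        self.actions_used = 0

    def check(self, tool_name):
        """Return (allowed, reason) for a proposed tool call."""
        if tool_name not in self.allowed_tools:
            # Rejected calls do not consume budget in this sketch.
            return False, "tool not on allow-list"
        if self.actions_used >= self.max_actions:
            return False, "action budget exhausted"
        self.actions_used += 1
        return True, "ok"


def residual_failure_rate(failures_after_mitigation, total_cases):
    """Fraction of campaign cases that still fail once a mitigation is on."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return failures_after_mitigation / total_cases
```

For example, if 3 of 50 cases still fail after the validator is enabled, `residual_failure_rate(3, 50)` reports a 6% residual failure rate on this campaign. That figure describes only the inputs that were tested, so the report should present it as an estimate of residual risk and not as a guarantee.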
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Career paths this builds toward
Canonical roles
AI Safety Researcher
Designing and running red-team campaigns on agentic systems is exactly the work AI safety researchers do on enterprise AI teams and at frontier labs.
This challenge sharpens
- ai-red-teaming
- agent-safety
- prompt-injection
AI Engineer
Hands-on knowledge of agent failure modes and mitigations is the AI-engineer skill set product teams need before shipping agentic features.
This challenge sharpens
- agent-safety
- prompt-injection
- evaluation
AI Solutions Architect
Translating audit findings into mitigation architectures and compliance language is the work AI solutions architects do for customers in regulated industries.
This challenge sharpens
- threat-modeling
- audit-reporting
- agent-safety