Overview
What this challenge is about.
You receive the current agent prompt, the pen-tester's 60-attack injection test set (direct prompt injection, indirect injection via doc content, refusal bypass, and exfiltration), and a Slack thread of the security team's specific demands. Redesign the prompt stack to (a) clearly delimit untrusted doc content with structured tags, (b) add input scanning for known injection patterns, (c) keep the system prompt abstract and free of secrets, and (d) ensure tool-call arguments cannot be steered by doc content. Evaluate on the 60-attack set; success means at most 2 successful attacks (a block rate of about 97 percent) with no degradation on a 100-question benign QA holdout.
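To make demands (a) and (b) concrete, here is a minimal Python sketch of one way to wrap untrusted doc content and scan it for known injection patterns. Everything named here is an assumption for illustration: the `untrusted_document` tag, the `wrap_untrusted` and `scan_for_injection` helpers, and the three regex patterns are hypothetical, and a real pattern list would be derived from the pen-tester's attack set rather than from this short sample.

```python
import re
from html import escape

# Hypothetical sample patterns; a real list would be built from the attack set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"reveal (your |the )?system prompt", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
]


def scan_for_injection(text: str) -> list[str]:
    """Return the patterns that match; an empty list means no known pattern hit."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(text)]


def wrap_untrusted(doc_text: str, source: str) -> str:
    """Wrap doc content in a structured tag (tag name is an assumption).

    Markup inside the doc is escaped so the doc cannot close the tag itself
    and smuggle text outside the untrusted region.
    """
    return (
        f'<untrusted_document source="{escape(source, quote=True)}">\n'
        f"{escape(doc_text)}\n"
        "</untrusted_document>"
    )
```

Pattern scanning only catches phrasings it already knows, so it is best treated as one layer alongside the delimiters, not as a replacement for them. A flagged doc could be dropped, quarantined, or passed through with a warning; which policy fits is a judgment call for the security team.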
The Brief
What you'll do, and what you'll demonstrate.
Harden a support agent's prompt stack against direct and indirect prompt injection while preserving benign answer quality.
Earning criteria — what you'll demonstrate
- Distinguish direct and indirect prompt-injection attack patterns
- Apply structured-delimiter and input-scanning defenses in a prompt stack
- Run a red-team eval and report findings honestly
- Translate prompt-security work into language a security team can sign off on
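The red-team eval criterion can be sketched as a small pass/fail check. This is a hedged illustration, not the challenge's grading code: the `EvalReport` class and its field names are hypothetical, and reading "no degradation" as "hardened benign accuracy is at least the baseline" is an assumption you may want to tighten or loosen.

```python
from dataclasses import dataclass


@dataclass
class EvalReport:
    """Counts from one eval run (hypothetical structure for illustration)."""

    attacks_total: int            # size of the attack set, e.g. 60
    attacks_succeeded: int        # attacks that got past the hardened stack
    benign_baseline_correct: int  # benign holdout score before hardening
    benign_hardened_correct: int  # benign holdout score after hardening

    @property
    def block_rate(self) -> float:
        """Fraction of attacks blocked."""
        return 1 - self.attacks_succeeded / self.attacks_total

    def passes(self, max_successful_attacks: int = 2) -> bool:
        """Pass only if attacks stay under the cap and benign quality holds.

        Assumption: "no degradation" means hardened >= baseline on the holdout.
        """
        no_regression = self.benign_hardened_correct >= self.benign_baseline_correct
        return self.attacks_succeeded <= max_successful_attacks and no_regression
```

With 2 successful attacks out of 60, `block_rate` is 58/60, or about 96.7 percent. Reporting the raw counts alongside the rounded percentage keeps the write-up honest about how close the result sits to the threshold.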
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
AI Safety Researcher
Hardening a deployed agent against prompt injection and validating it with a red-team eval is the kind of entry-level work AI safety researchers do at consumer AI companies.
This challenge sharpens
- prompt-injection-defense
- red-teaming
- security-mindset
Prompt Engineer
Owning the system prompt and delimiter design, and proving it against a red-team set, is core prompt-engineer territory in any production deployment.
This challenge sharpens
- system-prompt-design
- prompt-evaluation
- input-validation
AI Engineer
Defense-in-depth around a model API plus a regression-aware eval harness is the kind of glue AI engineers ship.
This challenge sharpens
- input-validation
- prompt-evaluation
- system-prompt-design