Skip to contentSkip to content
Verified credentials. On-chain. Forever.Learn more
Cover image for Safety-Test a Customer-Service Agent for Adversarial Prompts
Research

Safety-Test a Customer-Service Agent for Adversarial Prompts

FreeVerified credential2 weeksAdvanced

Overview

What this challenge is about.

You receive a sandboxed instance of the agent (a tool-using LLM that can read account balances and open support tickets — both mocked). Design a red-team suite of at least 80 prompts across 4 categories: jailbreaks, prompt injection via tool outputs, PII exfiltration attempts, and policy-violating action requests. Run the suite, score the agent's response on a documented rubric, and write a 5-page red-team report with severity-graded findings, reproduction steps, and recommended mitigations.

CredentialBlockchain-anchored
ShareableLinkedIn-ready
LanguageEnglish
PaceSelf-paced

The Brief

What you'll do, and what you'll demonstrate.

Surface, score, and document the safety failure modes of a customer-service agent before its production launch.

Earning criteria — what you'll demonstrate

  • Design structured adversarial-prompt suites across multiple risk categories
  • Score LLM outputs against a documented safety rubric
  • Reason about prompt-injection threat models in tool-using agents
  • Communicate red-team findings to a risk-committee audience

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

AI Safety Researcher

Structured red-team work with a severity-graded, CISO-readable report is the entry-level AI-safety-researcher pattern at consultancies and labs.

This challenge sharpens

  • red-teaming
  • adversarial-prompts
  • guardrails

AI Engineer

Designing and shipping a reusable red-team harness is the AI-engineer half of safety work.

This challenge sharpens

  • llm-agents
  • guardrails
  • agent-evaluation

One more thing

You can put a credential on your CV by Friday.