Red-Team a Customer-Service Chatbot for Jailbreak Resistance

FreeVerified credential3 weeksAdvanced

Overview

What this challenge is about.

Use a published taxonomy of jailbreak categories (prompt injection, persona override, encoded payloads, multi-turn escalation, refusal bypass, tool-misuse). For each category, design at least 8 attack prompts (so 48+ total). Run them against the chatbot in a controlled environment, score success/failure with a documented rubric, and quantify success rate per category with bootstrap confidence intervals. Produce a 6-page red-team report including 5 worked-example failures, recommended mitigations (input filters, system-prompt hardening, tool gating), and a 1-page exec summary. Do not test on production traffic; use a staging endpoint.

CredentialBlockchain-anchored

ShareableLinkedIn-ready

LanguageEnglish

PaceSelf-paced

The Brief

What you'll do, and what you'll demonstrate.

Run a structured red-team campaign on a customer-service LLM chatbot and ship a remediation-ready report.

Earning criteria — what you'll demonstrate

Apply a published jailbreak taxonomy to a real product
Design and score adversarial prompts systematically
Quantify safety with honest statistics, not vibe scores
Translate findings into mitigations engineers can ship

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

AI Safety and Alignment

Master · Ai Ml

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career paths this builds toward

Canonical roles

AI Safety Researcher
AI Research

AI Safety Researcher

Structured red-teaming with statistics and remediation framing is the AI safety researcher's most marketable craft right now.

This challenge sharpens

red-teaming
jailbreak-analysis
safety-evaluation

Prompt Engineer

Designing attack prompts across a taxonomy is exactly the prompt engineer's adversarial mode of operation.

This challenge sharpens

prompt-engineering
jailbreak-analysis
llm-evaluation

AI Engineer

Translating red-team findings into shippable mitigations (filters, gating, system-prompt hardening) is the AI engineer's daily contribution.

This challenge sharpens

llm-evaluation
prompt-engineering
safety-evaluation

One more thing

You can put a credential on your CV by Friday.

Start this challenge