Research

Red-Team a Customer-Service Chatbot for Jailbreak Resistance

Free · Verified credential · 3 weeks · Advanced

Overview

What this challenge is about.

Use a published taxonomy of jailbreak categories (prompt injection, persona override, encoded payloads, multi-turn escalation, refusal bypass, tool misuse). For each of the six categories, design at least 8 attack prompts (48 or more in total). Run them against the chatbot in a controlled environment, score each attempt as a success or failure using a documented rubric, and report the attack success rate per category with bootstrap confidence intervals. Produce a 6-page red-team report that includes 5 worked-example failures, recommended mitigations (input filters, system-prompt hardening, tool gating), and a 1-page executive summary. Do not test on production traffic; use a staging endpoint.
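As a rough illustration of the statistics step, the sketch below computes a per-category attack success rate with a percentile bootstrap confidence interval. It is a minimal sketch using only the Python standard library; the function names (`bootstrap_ci`, `per_category_rates`), the `(category, success)` input shape, and the defaults of 2000 resamples and a 95% interval are assumptions made for this example, not part of the brief.

```python
import random
from collections import defaultdict


def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a success rate.

    outcomes: list of 0/1 scores (1 = the attack succeeded under the rubric).
    Returns (point_estimate, lower, upper).
    """
    rng = random.Random(seed)  # fixed seed so the report is reproducible
    n = len(outcomes)
    # Resample with replacement and record the success rate of each resample.
    rates = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    lower = rates[int((alpha / 2) * n_resamples)]
    upper = rates[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(outcomes) / n, lower, upper


def per_category_rates(results):
    """Group (category, success) pairs and bootstrap each category separately."""
    by_category = defaultdict(list)
    for category, success in results:
        by_category[category].append(1 if success else 0)
    return {cat: bootstrap_ci(scores) for cat, scores in by_category.items()}


if __name__ == "__main__":
    # Hypothetical scored results: 8 prompts in each of two categories.
    results = (
        [("prompt injection", s) for s in (1, 0, 0, 1, 0, 1, 0, 0)]
        + [("persona override", s) for s in (0, 0, 0, 0, 0, 0, 0, 0)]
    )
    for cat, (rate, lo, hi) in per_category_rates(results).items():
        print(f"{cat}: {rate:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

With only 8 prompts per category the intervals will be wide, and a category with zero observed successes yields a degenerate interval of 0 to 0 under the percentile bootstrap, which does not mean the true rate is zero. Both caveats are worth stating in the report.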

Credential: Blockchain-anchored
Shareable: LinkedIn-ready
Language: English
Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Run a structured red-team campaign on a customer-service LLM chatbot and ship a remediation-ready report.

Earning criteria — what you'll demonstrate

  • Apply a published jailbreak taxonomy to a real product
  • Design and score adversarial prompts systematically
  • Quantify safety with honest statistics, not vibe scores
  • Translate findings into mitigations engineers can ship

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

AI Safety Researcher

Structured red-teaming with statistics and remediation framing is the AI safety researcher's most marketable craft right now.

This challenge sharpens

  • red-teaming
  • jailbreak-analysis
  • safety-evaluation

Prompt Engineer

Designing attack prompts across a taxonomy is exactly the prompt engineer's adversarial mode of operation.

This challenge sharpens

  • prompt-engineering
  • jailbreak-analysis
  • llm-evaluation

AI Engineer

Translating red-team findings into shippable mitigations (filters, gating, system-prompt hardening) is the AI engineer's daily contribution.

This challenge sharpens

  • llm-evaluation
  • prompt-engineering
  • safety-evaluation

One more thing

You can put a credential on your CV by Friday.