Spec Trust-and-Safety Eval Harness for an LLM-Powered Customer-Support Bot

FreeVerified credential3 weeksAdvanced

Overview

What this challenge is about.

You will spec a 6-page evaluation harness covering: (1) jailbreak test set (about 200 prompts across 6 attack families), (2) PII-leakage probes (about 100 synthetic-customer prompts), (3) harmful-output classifier integration (use Detoxify or similar), (4) regression-detection gating (block deploy if any axis regresses beyond a threshold). Produce a reference Python implementation that runs the three axes on a small example bot. Deliver the spec, the reference harness, a 1-page exec summary, and an example nightly-eval report.

CredentialBlockchain-anchored

ShareableLinkedIn-ready

LanguageEnglish

PaceSelf-paced

The Brief

What you'll do, and what you'll demonstrate.

Spec and reference-implement a nightly trust-and-safety harness for an LLM customer-support bot covering jailbreaks, PII, and toxicity.

Earning criteria — what you'll demonstrate

Design a multi-axis safety evaluation harness for LLM products
Curate jailbreak and PII test sets at useful scale
Integrate a toxicity classifier into automated gating
Document a harness so engineering can pick it up next sprint

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Trustworthy AI, Robustness, and Safety

Master · Ai Ml

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

AI Safety Researcher

Designing a nightly safety eval harness for an LLM product is the AI safety researcher's textbook job at any enterprise-AI vendor.

This challenge sharpens

llm-evaluation
red-teaming
pii-detection

MLOps Engineer

Gating deploys on regression-detection thresholds is the MLOps engineer's craft applied to safety axes.

This challenge sharpens

regression-detection
harness-design
python

Prompt Engineer

Curating jailbreak test sets and reasoning about LLM failure axes is core prompt-engineer territory in safety-conscious product teams.

This challenge sharpens

llm-evaluation
red-teaming
harness-design

One more thing

You can put a credential on your CV by Friday.

Start this challenge