Prototype Constitutional-AI Style Guardrails for an Internal Chatbot
Overview
What this challenge is about.
Author a 'constitution' of 15 to 20 principles tailored to internal research use (no IP leakage, no off-label medical claims, no personnel-data fishing, etc.). Implement a critique-and-revise loop using a small open-source model: the model drafts a response, then critiques itself against the constitution, then revises. Evaluate the loop on 30 held-out red-team prompts (you write these). Measure refusal rate, revision rate, and false-positive over-refusal rate. Produce a 5-page deployment readiness document explaining the constitution, the loop, the evaluation, and a 30-day shadow-deployment plan.
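The draft, critique, revise sequence described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: `generate` stands in for whatever small open-source model you choose (any function that maps a prompt string to a completion string), and the three principles, the `NO_VIOLATION` sentinel, and the prompt wording are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical excerpt; the full constitution would hold 15 to 20 principles.
CONSTITUTION = (
    "Do not reveal proprietary or unpublished intellectual property.",
    "Do not make off-label medical claims.",
    "Do not disclose or help search for personnel data.",
)

# Assumed sentinel the critique prompt asks the model to emit when the draft is clean.
NO_VIOLATION = "NO_VIOLATION"


@dataclass
class LoopResult:
    draft: str
    critique: str
    final: str
    revised: bool


def critique_and_revise(
    prompt: str,
    generate: Callable[[str], str],
    principles: Sequence[str] = CONSTITUTION,
) -> LoopResult:
    """Run one draft -> critique -> revise pass.

    `generate` is an assumption: any callable wrapping your model.
    """
    # Step 1: draft a response to the user's request.
    draft = generate(f"User request:\n{prompt}\n\nAnswer:")

    # Step 2: ask the model to critique its own draft against the constitution.
    rules = "\n".join(f"{i}. {p}" for i, p in enumerate(principles, 1))
    critique = generate(
        f"Principles:\n{rules}\n\nResponse to review:\n{draft}\n\n"
        f"List each principle the response violates, or reply exactly {NO_VIOLATION}."
    )

    # Step 3: keep the draft if the critique is clean, otherwise revise.
    if critique.strip() == NO_VIOLATION:
        return LoopResult(draft, critique, draft, revised=False)

    final = generate(
        f"Original request:\n{prompt}\n\nDraft:\n{draft}\n\n"
        f"Critique:\n{critique}\n\n"
        "Rewrite the draft so it complies with every principle."
    )
    return LoopResult(draft, critique, final, revised=True)
```

Passing the model in as a callable keeps the loop testable with a stub and lets you log `draft`, `critique`, and `final` per prompt, which the evaluation needs in order to compute the revision rate.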
The Brief
What you'll do, and what you'll demonstrate.
Prototype a Constitutional-AI-style guardrail system for an internal research chatbot and document deployment readiness.
Earning criteria — what you'll demonstrate
- Author an internal-use constitution for an LLM application
- Implement a critique-and-revise loop in code
- Measure over-refusal as rigorously as missed refusals; wrongly refusing a legitimate request and wrongly answering a harmful one are both failures
- Plan a staged shadow-deployment for an AI guardrail system
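One possible way to compute the three metrics is sketched below. The definitions are assumptions, since the brief names the metrics without fixing their denominators: refusal rate is taken over prompts labeled as ones that should be refused, over-refusal rate over prompts labeled as legitimate, and revision rate over all prompts. The `EvalRecord` fields are hypothetical, and the sketch assumes your prompt set includes some legitimate requests, because over-refusal cannot be measured without them.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EvalRecord:
    should_refuse: bool  # your label: is this prompt one the bot ought to decline?
    refused: bool        # did the final (post-loop) response decline?
    revised: bool        # did the critique step trigger a revision?


def _rate(hits: int, total: int) -> float:
    # Return 0.0 for an empty group rather than dividing by zero.
    return hits / total if total else 0.0


def guardrail_metrics(records: List[EvalRecord]) -> Dict[str, float]:
    """Compute refusal, over-refusal, and revision rates (assumed definitions)."""
    harmful = [r for r in records if r.should_refuse]
    benign = [r for r in records if not r.should_refuse]
    return {
        # Share of should-refuse prompts that were actually refused.
        "refusal_rate": _rate(sum(r.refused for r in harmful), len(harmful)),
        # Share of legitimate prompts that were wrongly refused (false positives).
        "over_refusal_rate": _rate(sum(r.refused for r in benign), len(benign)),
        # Share of all prompts where the loop rewrote the draft.
        "revision_rate": _rate(sum(r.revised for r in records), len(records)),
    }
```

Reporting the two refusal figures side by side makes the trade-off visible: a constitution that pushes the refusal rate up while also pushing the over-refusal rate up has not obviously improved the chatbot.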
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
AI Safety Researcher
Building practical alignment scaffolding (constitution + critique loop + over-refusal eval) is exactly what a safety researcher contributes to an organization deploying LLMs.
This challenge sharpens
- constitutional-ai
- alignment-techniques
- safety-evaluation
Prompt Engineer
Designing critique prompts and measuring their effect on outputs is the prompt engineer's daily craft applied to safety.
This challenge sharpens
- prompt-engineering
- constitutional-ai
- llm-evaluation
AI Solutions Architect
Translating guardrail design into a deployment plan with rollback conditions is the architect's bridge between policy and product.
This challenge sharpens
- ai-governance
- alignment-techniques
- safety-evaluation