Stress-Test Scalable Oversight on a Tool-Using Agent

Free · Verified credential · 4 weeks · Expert

Overview

What this challenge is about.

Design a sandwich-oversight study: pick a task domain where non-expert oversight is plausible but not trivial (e.g., reviewing data-analysis steps, checking small bug fixes, evaluating short legal summaries). Recruit 6 non-expert reviewers. Have them oversee a tool-using agent on 30 tasks each. Compare to expert ground truth. Measure oversight accuracy, oversight time per task, and where non-experts fail. Vary one factor (e.g., access to the agent's reasoning trace) to isolate its effect. Report results with confidence intervals. Produce an 8-page research report following standard NeurIPS-style structure plus an honest limitations section.
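One way to report oversight accuracy with confidence intervals is sketched below. It assumes each review is stored as a record with hypothetical field names (`reviewer`, `verdict`, `expert`) that are not part of the brief. Because each reviewer contributes 30 correlated judgments, the sketch resamples whole reviewers (a cluster bootstrap) rather than individual reviews. With only six reviewers the resulting interval is coarse, which is worth noting in the limitations section.

```python
import random
from statistics import mean


def oversight_accuracy(records):
    """Fraction of reviewer verdicts that agree with the expert label."""
    return mean(1.0 if r["verdict"] == r["expert"] else 0.0 for r in records)


def cluster_bootstrap_ci(records, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for accuracy, resampling whole reviewers with replacement.

    Field names and defaults are illustrative assumptions, not a prescribed schema.
    """
    rng = random.Random(seed)

    # Group reviews by reviewer so each reviewer is resampled as a unit.
    by_reviewer = {}
    for r in records:
        by_reviewer.setdefault(r["reviewer"], []).append(r)
    reviewers = sorted(by_reviewer)

    stats = []
    for _ in range(n_boot):
        sample = []
        for rid in rng.choices(reviewers, k=len(reviewers)):
            sample.extend(by_reviewer[rid])
        stats.append(oversight_accuracy(sample))

    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

Calling `cluster_bootstrap_ci` once per condition (for example, with and without the reasoning trace) gives one interval per arm. A bootstrap over the per-reviewer difference between arms is a natural extension for the manipulated factor.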

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Run a sandwich-style scalable-oversight study on a tool-using agent, isolating the effect of one oversight aid.

Earning criteria — what you'll demonstrate

Design a sandwich-oversight study end-to-end
Recruit and brief non-expert reviewers for a research study
Isolate the effect of one oversight aid via a manipulated variable
Write a publishable-quality research report with an honest limitations section
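Isolating one oversight aid usually means balancing which tasks each reviewer sees under each condition. The sketch below is one possible counterbalanced within-subject assignment, not a required design. The condition names (`trace_hidden`, `trace_shown`) and the function name are hypothetical, and the sketch assumes exactly two conditions and an even number of reviewers.

```python
import random


def assign_conditions(reviewers, tasks,
                      conditions=("trace_hidden", "trace_shown"), seed=0):
    """Return {reviewer: {task: condition}} with a balanced split.

    Tasks are randomly divided into two blocks. Even-indexed reviewers get
    block A under the first condition and block B under the second;
    odd-indexed reviewers get the reverse. Each task therefore appears under
    both conditions equally often when the number of reviewers is even.
    """
    rng = random.Random(seed)
    shuffled = list(tasks)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    block_a = set(shuffled[:half])

    plan = {}
    for i, reviewer in enumerate(reviewers):
        # Flip the condition order for alternating reviewers.
        first, second = conditions if i % 2 == 0 else conditions[::-1]
        plan[reviewer] = {t: (first if t in block_a else second) for t in tasks}
    return plan
```

With 6 reviewers and 30 tasks, this gives each reviewer 15 tasks per condition and puts every task under each condition for 3 reviewers, so task difficulty and reviewer skill are not confounded with the manipulated factor.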

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

AI Safety and Alignment

Master · AI/ML

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career paths this builds toward

Canonical roles

AI Safety Researcher

Scalable-oversight studies sit at the frontier of alignment research; running one cleanly is a strong hiring signal, even for senior roles.

This challenge sharpens

scalable-oversight
alignment-research
experiment-design

ML Researcher

Designing a pre-registered study with a manipulated variable is the ML researcher's quality bar applied to a human-AI setting.

This challenge sharpens

experiment-design
statistical-evaluation
research-writing

Research Scientist

Publication-quality write-ups with honest limitations sections are how junior research scientists earn their first byline.

This challenge sharpens

research-writing
experiment-design
human-evaluation
