Overview
What this challenge is about.
You receive 20 anonymized middle-school essays, each scored by two human teachers on a four-dimension rubric (structure, evidence, voice, mechanics). Design an LLM-based feedback system that returns, for each essay: (a) a score on each of the four rubric dimensions, (b) three strengths, and (c) three actionable improvements written at the student's grade level. Run the system on all 20 essays and compute inter-rater agreement (Cohen's kappa or weighted kappa) between the system and each teacher separately, since Cohen's kappa compares exactly two raters at a time. Deliver the system plus a 4-page design memo that includes a section on what NOT to automate (e.g., assigning grades of record).
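The agreement step can be sketched in plain Python. This is a minimal illustration, not a prescribed implementation: the function name `cohen_kappa`, the 1-4 score scale, and the sample scores are all assumptions made for the example (the brief does not state the rubric's scale). It computes unweighted kappa by default and linear or quadratic weighted kappa on request, treating the score categories as ordered.

```python
import math


def cohen_kappa(rater_a, rater_b, categories, weights=None):
    """Cohen's kappa between two raters over the same items.

    categories: ordered list of every possible score (assumed scale).
    weights: None (unweighted), "linear", or "quadratic".
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of items")
    if weights not in (None, "linear", "quadratic"):
        raise ValueError("weights must be None, 'linear', or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two score categories")

    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Joint distribution of (rater A score, rater B score) as proportions.
    joint = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        joint[index[a]][index[b]] += 1.0 / n

    # Marginal distribution of each rater's scores.
    marg_a = [sum(joint[i]) for i in range(k)]
    marg_b = [sum(joint[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Penalty for scoring category i vs. category j.
        if weights is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    observed = sum(disagreement(i, j) * joint[i][j]
                   for i in range(k) for j in range(k))
    expected = sum(disagreement(i, j) * marg_a[i] * marg_b[j]
                   for i in range(k) for j in range(k))

    if expected == 0.0:
        # Both raters used one identical category: kappa is undefined.
        return math.nan
    return 1.0 - observed / expected


# Hypothetical scores on one rubric dimension, assumed 1-4 scale.
SCALE = [1, 2, 3, 4]
system_scores = [1, 2, 3, 4]
teacher_scores = [1, 2, 3, 3]

unweighted = cohen_kappa(system_scores, teacher_scores, SCALE)
quadratic = cohen_kappa(system_scores, teacher_scores, SCALE, "quadratic")
```

Because rubric scores are ordinal, weighted kappa gives partial credit for near misses (a 3 against a 4) that unweighted kappa counts as full disagreement. With only 20 essays, any kappa estimate will be noisy, so it is worth reporting the system-vs-teacher values alongside the teacher-vs-teacher value for comparison.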
The Brief
What you'll do, and what you'll demonstrate.
Design a higher-order essay-feedback system with measurable inter-rater agreement and clear human-in-the-loop boundaries.
Earning criteria — what you'll demonstrate
- Design automated-assessment systems with explicit human-in-the-loop boundaries
- Compute inter-rater agreement (Cohen's kappa) correctly
- Translate pedagogy rubrics into LLM prompts
- Communicate AI-feedback limits to parents and teachers
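One way to approach the rubric-to-prompt criterion is to generate the prompt from the rubric itself and then check the model's reply against the required shape before anyone sees it. The sketch below is illustrative only: `build_prompt`, `validate_feedback`, the JSON keys, and the score range parameters are hypothetical names and assumptions, and no specific LLM API is called.

```python
import json

# The four rubric dimensions named in the brief.
RUBRIC_DIMENSIONS = ("structure", "evidence", "voice", "mechanics")


def build_prompt(essay_text, grade_level, score_min, score_max):
    """Turn the rubric into an instruction block (hypothetical wording)."""
    dims = ", ".join(RUBRIC_DIMENSIONS)
    return (
        f"You are giving feedback on a grade {grade_level} student essay.\n"
        f"Score each dimension ({dims}) as an integer from "
        f"{score_min} to {score_max}.\n"
        "List exactly 3 strengths and exactly 3 actionable improvements, "
        f"written so a grade {grade_level} student can understand them.\n"
        "Scores are advisory; a teacher assigns the grade of record.\n"
        'Reply with JSON only: {"scores": {...}, "strengths": [...], '
        '"improvements": [...]}\n\n'
        f"Essay:\n{essay_text}"
    )


def validate_feedback(raw_json, score_min, score_max):
    """Reject replies that do not match the required output shape."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    scores = data.get("scores")
    if not isinstance(scores, dict) or set(scores) != set(RUBRIC_DIMENSIONS):
        raise ValueError("scores must cover exactly the four dimensions")
    for dim, value in scores.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{dim} score must be an integer")
        if not score_min <= value <= score_max:
            raise ValueError(f"{dim} score out of range")

    for key in ("strengths", "improvements"):
        items = data.get(key)
        if not isinstance(items, list) or len(items) != 3:
            raise ValueError(f"{key} must be a list of exactly 3 items")
        if not all(isinstance(s, str) and s.strip() for s in items):
            raise ValueError(f"{key} items must be non-empty strings")
    return data


# Hypothetical model reply, assuming a 1-4 scale.
sample_reply = json.dumps({
    "scores": {"structure": 3, "evidence": 2, "voice": 4, "mechanics": 3},
    "strengths": ["Clear opening", "Strong voice", "Good paragraph order"],
    "improvements": ["Add a quote", "Explain your example", "Check commas"],
})
checked = validate_feedback(sample_reply, 1, 4)
```

A reply that fails validation can be routed to a teacher instead of being retried silently, which keeps the human-in-the-loop boundary explicit in the code path as well as in the memo.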
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
AI Product Designer
Designing automated-assessment UX with explicit human-in-the-loop boundaries is core to AI product design in regulated sectors such as K-12 education.
This challenge sharpens
- rubric-design
- user-research
- automated-assessment
AI Product Manager
Translating teachers' pedagogy and parents' need for transparency into product requirements is core to the AI PM role at K-12 companies.
This challenge sharpens
- evaluation-design
- automated-assessment
- user-research