Evaluate Speech Synthesis Voices for an EdTech Storyteller App
Overview
What this challenge is about.
You will generate 60 Spanish audio clips, 20 from each of three text-to-speech (TTS) vendors, covering 4 story genres and 3 emotional tones. Recruit 15 native Spanish speakers through a remote panel (Prolific or a local equivalent) and run a Mean Opinion Score (MOS) study, in which listeners rate perceived naturalness on a 1-5 scale, plus pairwise A/B comparisons of expressiveness. Benchmark each vendor's per-minute cost and inference latency. Deliver the audio samples, the listening-study results, a cost-latency table, and a 3-page vendor recommendation memo.
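One way to analyze the pairwise A/B comparisons is an exact two-sided sign test on the preference counts. The sketch below assumes each listener contributes a single vote per vendor pair (for example, their majority preference across clips), so the votes can be treated as independent, and that ties are dropped. The function name and the counts in the usage lines are illustrative placeholders, not study data.

```python
from math import comb


def sign_test_p_value(wins_a: int, wins_b: int) -> float:
    """Exact two-sided sign test for a pairwise preference count.

    Null hypothesis: listeners are equally likely to prefer A or B.
    Ties should be removed before calling this function.
    """
    n = wins_a + wins_b
    if n == 0:
        raise ValueError("need at least one non-tied preference")
    k = max(wins_a, wins_b)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    # Double for the two-sided test and cap at 1.
    return min(1.0, 2 * tail)


# Placeholder counts: 12 listeners preferred vendor A, 3 preferred vendor B.
p = sign_test_p_value(12, 3)
print(f"two-sided p-value: {p:.3f}")
```

With only 15 listeners the test has limited power, so a non-significant result is weak evidence of equivalence rather than proof of it.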
The Brief
What you'll do, and what you'll demonstrate.
Pick the best TTS vendor for Spanish children's storytelling, judged on quality, cost, and latency, and back the choice with a defensible methodology.
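A simple way to make the trade-off explicit is a weighted score over min-max normalized metrics. The sketch below is one possible approach, not a prescribed method: the vendor names, metric values, and weights are all hypothetical placeholders, and the weights in particular are a product judgment that the memo should justify.

```python
def normalize(values, higher_is_better):
    """Min-max scale values to [0, 1], where 1 is always the best outcome."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)  # no spread: every vendor ties on this metric
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank_vendors(vendors, weights):
    """Return (vendor, score) pairs sorted from best to worst."""
    names = list(vendors)
    total = {name: 0.0 for name in names}
    for metric, (weight, higher_is_better) in weights.items():
        scores = normalize([vendors[n][metric] for n in names], higher_is_better)
        for name, score in zip(names, scores):
            total[name] += weight * score
    return sorted(total.items(), key=lambda item: item[1], reverse=True)


# Hypothetical placeholder measurements, not real vendor data.
vendors = {
    "vendor_a": {"mos": 4.2, "usd_per_min": 0.030, "latency_s": 0.9},
    "vendor_b": {"mos": 3.9, "usd_per_min": 0.010, "latency_s": 0.4},
    "vendor_c": {"mos": 4.0, "usd_per_min": 0.020, "latency_s": 1.5},
}
# Assumed weights: (weight, higher_is_better). They sum to 1.
weights = {
    "mos": (0.5, True),
    "usd_per_min": (0.25, False),
    "latency_s": (0.25, False),
}

for name, score in rank_vendors(vendors, weights):
    print(f"{name}: {score:.2f}")
```

Rerunning the ranking under a few alternative weightings shows whether the recommended vendor is stable or depends on one particular choice of weights.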
Earning criteria: what you'll demonstrate
- Design a listening study that surfaces real perceptual differences
- Compute MOS with honest confidence intervals on small samples
- Combine quality, cost, and latency into a defensible vendor pick
- Translate qualitative perceptual results into a product decision
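For the MOS confidence intervals, one reasonable sketch is a percentile bootstrap that resamples listeners rather than individual ratings. Each listener rates many clips, so their ratings are correlated, and treating every rating as independent would make the interval look narrower than it should. The function name and the per-listener means below are illustrative placeholders, not study results.

```python
import random
import statistics


def bootstrap_mos_ci(listener_means, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for a vendor's MOS, resampling listeners.

    listener_means holds one value per listener: that listener's mean
    rating across all of the vendor's clips.
    """
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(listener_means)
    boot = sorted(
        statistics.fmean(rng.choices(listener_means, k=n))
        for _ in range(n_resamples)
    )
    lower = boot[int((alpha / 2) * n_resamples)]
    upper = boot[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(listener_means), lower, upper


# Placeholder per-listener means for one vendor (15 listeners).
listener_means = [4.1, 3.8, 4.4, 3.9, 4.0, 4.3, 3.6, 4.2,
                  4.0, 3.7, 4.5, 4.1, 3.9, 4.2, 3.8]
mos, lower, upper = bootstrap_mos_ci(listener_means)
print(f"MOS = {mos:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

With 15 listeners the percentile interval is itself approximate, so overlapping intervals between vendors are best reported as inconclusive rather than as a win for either one.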
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Applied AI Scientist
Running a perceptual study with honest statistics and turning it into a vendor pick is the core of an applied AI scientist's job on consumer-product AI teams.
This challenge sharpens
- tts-evaluation
- listening-studies
- vendor-evaluation
AI Product Manager
Owning a defensible vendor selection that balances quality and cost is a textbook AI PM responsibility.
This challenge sharpens
- vendor-evaluation
- data-storytelling
- mos-scoring
Data Scientist
Listening-study analysis with bootstrap confidence intervals translates directly to broader data-science work on small-sample research.
This challenge sharpens
- listening-studies
- mos-scoring
- data-storytelling