Overview
What this challenge is about.
You receive a Markdown snapshot of the documentation and 120 real support questions, each paired with the URLs of the pages that contain the answer.
Build an open-domain QA pipeline: chunk the docs (300-500 tokens with overlap), embed the chunks and index them in a vector store, retrieve the top-k chunks, optionally rerank them, and produce a 2-3 sentence answer with cited URLs.
Evaluate on three measures:
- Hits@5 retrieval: does the gold URL appear in the top 5?
- Answer factual accuracy, scored via a 30-question manual rubric.
- Citation precision: do the cited URLs actually support the answer?
Success means Hits@5 above 85 percent, factual accuracy above 90 percent, and citation precision above 90 percent.
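The chunk, embed, index, and retrieve steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference solution: whitespace-separated words stand in for model tokens, a bag-of-words counter stands in for a real embedding model, and a plain list stands in for a vector store. The function names (`chunk_tokens`, `embed`, `build_index`, `retrieve`) and the default overlap of 50 are hypothetical choices, not part of the challenge materials.

```python
import math
from collections import Counter


def chunk_tokens(tokens, size=400, overlap=50):
    """Split a token list into windows of `size` tokens that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


def embed(text):
    """Toy bag-of-words vector; an assumption standing in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_index(pages, size=400, overlap=50):
    """pages maps URL -> Markdown text. Returns a list of (url, chunk_text, vector)."""
    index = []
    for url, text in pages.items():
        for chunk in chunk_tokens(text.split(), size, overlap):
            chunk_text = " ".join(chunk)
            index.append((url, chunk_text, embed(chunk_text)))
    return index


def retrieve(index, question, k=5):
    """Return the k highest-scoring (score, url, chunk_text) triples."""
    query = embed(question)
    scored = [(cosine(query, vec), url, text) for url, text, vec in index]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]
```

Keeping the source URL attached to every chunk is the design choice that matters here: both the Hits@5 check and the cited URLs in the final answer depend on mapping a retrieved chunk back to its page.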
The Brief
What you'll do, and what you'll demonstrate.
Build an open-domain QA system over the company's docs that meets retrieval, factual, and citation quality bars for a public beta.
Earning criteria — what you'll demonstrate
- Build an end-to-end open-domain QA pipeline over real documentation
- Apply chunking and embedding strategies for retrieval quality
- Generate cited answers and evaluate citation precision
- Diagnose retrieval vs. generation failures separately
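One way to keep retrieval and generation failures apart is to score each question on both stages and attribute a wrong answer to whichever stage broke first. The sketch below assumes a hypothetical record format (keys `retrieved`, `gold`, `cited`, `supporting`, `correct`), where `correct` and `supporting` come from manual review and `correct` is `None` for questions that were not graded. Averaging citation precision per answer is also an assumption; the brief does not say how to aggregate it.

```python
from collections import Counter


def hits_at_k(retrieved_urls, gold_urls, k=5):
    """True if any gold URL appears among the first k retrieved URLs."""
    return any(url in gold_urls for url in retrieved_urls[:k])


def citation_precision(cited_urls, supporting_urls):
    """Fraction of cited URLs that support the answer (0.0 when nothing is cited)."""
    if not cited_urls:
        return 0.0
    return sum(url in supporting_urls for url in cited_urls) / len(cited_urls)


def diagnose(retrieval_hit, answer_correct):
    """Attribute a wrong answer to the stage that most likely caused it."""
    if answer_correct:
        return "ok"
    return "generation" if retrieval_hit else "retrieval"


def summarize(records, k=5):
    """Aggregate the three metrics plus a per-stage failure count."""
    if not records:
        raise ValueError("no records to evaluate")
    hits = [hits_at_k(r["retrieved"], r["gold"], k) for r in records]
    # Only manually graded records contribute to accuracy and citation precision.
    graded = [(h, r) for h, r in zip(hits, records) if r.get("correct") is not None]
    if not graded:
        raise ValueError("no manually graded records")
    return {
        "hits_at_k": sum(hits) / len(hits),
        "factual_accuracy": sum(bool(r["correct"]) for _, r in graded) / len(graded),
        "citation_precision": sum(
            citation_precision(r["cited"], r["supporting"]) for _, r in graded
        ) / len(graded),
        "failures": Counter(diagnose(h, r["correct"]) for h, r in graded),
    }


def meets_bars(summary):
    """Success thresholds from the Overview: above 85, 90, and 90 percent."""
    return (summary["hits_at_k"] > 0.85
            and summary["factual_accuracy"] > 0.90
            and summary["citation_precision"] > 0.90)
```

A high "retrieval" count in `failures` points at chunking, embedding, or reranking; a high "generation" count means the right page was found but the answer still went wrong, so the prompt or the answer-writing step deserves attention.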
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Career paths this builds toward
Canonical roles
NLP Engineer
Open-domain QA over real documentation with cited answers is the bread-and-butter shipping skill for NLP engineers at B2B SaaS companies.
This challenge sharpens
- open-domain-qa
- passage-retrieval
- citation-handling
AI Engineer
Gluing embeddings, vector stores, and generation into a beta-ready widget is core AI-engineer work in product-led teams.
This challenge sharpens
- python
- passage-retrieval
- evaluation
Machine Learning Engineer
Separating retrieval metrics from generation metrics and shipping a reproducible eval is the kind of MLE discipline hiring teams look for.
This challenge sharpens
- evaluation
- python
- passage-retrieval