Overview
What this challenge is about.
Define a small typed query language (filter, aggregate, group_by, time_range, metric). Curate or write 200 training examples covering the controlled subset, plus 50 held-out test examples. Implement a semantic parser (seq2seq with a constrained decoder, or a grammar-based parser such as Lark with a transformer encoder for argument filling). Compare it against an LLM baseline (calling a hosted model is fine) on execution accuracy, cost, and latency. Deliver a 4-page memo with a production-routing recommendation.
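As a starting point, the typed query language could be modeled with frozen dataclasses. This is a minimal sketch, not a required design: the brief names only the five constructs, so every class name, field name, and operator set below is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Literal, Optional, Tuple

# Hypothetical operator and aggregate vocabularies; the brief does not fix these.
Op = Literal["==", "!=", "<", "<=", ">", ">="]
Agg = Literal["sum", "avg", "min", "max", "count"]


@dataclass(frozen=True)
class Filter:
    field: str
    op: Op
    value: object


@dataclass(frozen=True)
class TimeRange:
    start: date
    end: date  # assumed inclusive

    def __post_init__(self) -> None:
        # Reject ill-formed ranges at construction time.
        if self.start > self.end:
            raise ValueError("time_range start must not be after end")


@dataclass(frozen=True)
class Query:
    metric: str
    aggregate: Agg
    filters: Tuple[Filter, ...] = ()
    group_by: Tuple[str, ...] = ()
    time_range: Optional[TimeRange] = None


# Example IR for a question like "total EMEA revenue by quarter in 2023"
# (the metric and field names are invented for this sketch).
q = Query(
    metric="revenue",
    aggregate="sum",
    filters=(Filter("region", "==", "EMEA"),),
    group_by=("quarter",),
    time_range=TimeRange(date(2023, 1, 1), date(2023, 12, 31)),
)
```

Keeping the IR immutable and comparable makes it easy to use as a parser target and as a key when caching executed results.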
The Brief
What you'll do, and what you'll demonstrate.
Build a semantic parser that handles the controlled finance-question subset with higher accuracy and lower cost than the LLM baseline.
Earning criteria — what you'll demonstrate
- Design a typed query intermediate representation
- Train a semantic parser (constrained-decoding or grammar-based)
- Evaluate execution accuracy honestly (not just string-match)
- Reason about routing between a deterministic parser and an LLM
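The last two criteria can be illustrated with a small sketch. It assumes a toy in-memory table and a toy query form (a tuple of equality filters), both invented here; `execute`, `parse`, and `call_llm` are hypothetical stand-ins for whatever executor, parser, and hosted-model client you actually build. The point is that two queries can differ as strings yet return the same rows, and that one simple routing rule is to fall back to the LLM only when the parser rejects the input.

```python
from collections import Counter

# Toy data, invented for illustration only.
TABLE = [
    {"region": "EMEA", "year": 2023, "revenue": 10},
    {"region": "EMEA", "year": 2022, "revenue": 7},
    {"region": "APAC", "year": 2023, "revenue": 5},
]


def execute(query):
    """Toy executor: a query is a tuple of (field, value) equality filters."""
    return [r["revenue"] for r in TABLE if all(r[f] == v for f, v in query)]


def same_result(a, b):
    """Compare two result sets as multisets, ignoring row order."""
    return Counter(a) == Counter(b)


def execution_accuracy(predicted, gold, run):
    """Fraction of predictions whose executed result matches the gold result."""
    if not gold:
        raise ValueError("gold set is empty")
    hits = 0
    for p, g in zip(predicted, gold):
        try:
            ok = same_result(run(p), run(g))
        except Exception:
            ok = False  # a query that fails to execute counts as wrong
        hits += ok
    return hits / len(gold)


def route(question, parse, call_llm):
    """Use the parser when it accepts the question, else fall back to the LLM.

    Assumes `parse` returns None for out-of-grammar input.
    """
    query = parse(question)
    if query is not None:
        return ("parser", query)
    return ("llm", call_llm(question))


gold = [(("region", "EMEA"), ("year", 2023))]
pred = [(("year", 2023), ("region", "EMEA"))]  # same meaning, different order

string_match = sum(p == g for p, g in zip(pred, gold)) / len(gold)
exec_acc = execution_accuracy(pred, gold, execute)
```

Here the string-match score penalizes a prediction that is semantically correct, while the execution-based score does not. A real routing policy would likely also weigh parser confidence, cost, and latency rather than relying on rejection alone.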
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
NLP Engineer
Semantic parsing on a real enterprise workload is day-one NLP engineering work at any AI-analytics or knowledge-product company.
This challenge sharpens
- semantic-parsing
- grammar-design
- transformer-models
AI Engineer
Designing the boundary between a deterministic parser and an LLM is the AI-engineering work that controls product cost and latency at scale.
This challenge sharpens
- grammar-design
- semantic-parsing
- evaluation
Applied AI Scientist
Turning a parser/LLM bake-off into a production-routing memo is typical of an applied AI scientist's daily work.
This challenge sharpens
- evaluation
- semantic-parsing
- transformer-models