Train a Domain-Specific Reranker for a Legal-Tech Search Box
Overview
What this challenge is about.
You receive 20,000 (query, document, relevance-label) triples from the firm's contract corpus. Fine-tune a small cross-encoder (e.g., cross-encoder/ms-marco-MiniLM-L-6-v2 or BAAI/bge-reranker-base) on these triples, compare it against the generic, un-fine-tuned baseline on a held-out 2,000-triple test set using nDCG@10 and Recall@5, and measure per-query latency on a single CPU. Deliver the trained model, the benchmark notebook, and a 2-page deployment recommendation covering a top-K-then-rerank architecture and a fallback to the generic reranker if the fine-tuned model fails health checks.
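To make the two evaluation metrics concrete, here is a minimal sketch of nDCG@10 and Recall@5 for a single query. It assumes each query's documents are already sorted by the reranker's score, that labels are non-negative numbers with 0 meaning "not relevant", and that gain is linear in the label (some setups use 2^label - 1 instead). The function names are illustrative and not part of the challenge materials.

```python
import math
from typing import Sequence


def dcg_at_k(labels: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k labels, in ranked order."""
    # Rank 0 is discounted by log2(2) = 1, rank 1 by log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(labels[:k]))


def ndcg_at_k(ranked_labels: Sequence[float], k: int = 10) -> float:
    """nDCG@k: DCG of the given ranking divided by DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(ranked_labels, reverse=True), k)
    return dcg_at_k(ranked_labels, k) / ideal if ideal > 0 else 0.0


def recall_at_k(ranked_labels: Sequence[float], k: int = 5) -> float:
    """Recall@k: share of the query's relevant documents found in the top k."""
    total_relevant = sum(1 for rel in ranked_labels if rel > 0)
    if total_relevant == 0:
        return 0.0  # Assumption: queries with no relevant documents score 0.
    return sum(1 for rel in ranked_labels[:k] if rel > 0) / total_relevant


# Labels of one query's documents, in the order the reranker returned them.
labels_in_ranked_order = [0, 1, 0, 0, 0, 1]
print(ndcg_at_k(labels_in_ranked_order, k=10))
print(recall_at_k(labels_in_ranked_order, k=5))
```

In a benchmark, these per-query values would typically be averaged over all test queries, once for the generic baseline and once for the fine-tuned model.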
The Brief
What you'll do, and what you'll demonstrate.
Lift contract-search nDCG@10 with a domain-fine-tuned reranker without blowing the latency budget.
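One way to check the latency budget is to time each query's rerank call and report percentiles rather than a single mean. The sketch below assumes a hypothetical score_fn(query, docs) callable that returns one score per document; measure_latency_ms and its warmup parameter are illustrative names. For a PyTorch-backed model, calling torch.set_num_threads(1) before benchmarking is one way to approximate a single-CPU setting.

```python
import math
import statistics
import time
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical scorer signature: (query, candidate documents) -> one score per document.
ScoreFn = Callable[[str, Sequence[str]], List[float]]


def measure_latency_ms(
    score_fn: ScoreFn,
    queries: Sequence[Tuple[str, Sequence[str]]],
    warmup: int = 3,
) -> Dict[str, float]:
    """Time score_fn once per query and summarize the timings in milliseconds."""
    if not queries:
        raise ValueError("need at least one (query, docs) pair to benchmark")

    # Warm-up calls are not timed, so one-off setup costs do not skew the results.
    for query, docs in queries[:warmup]:
        score_fn(query, docs)

    timings = []
    for query, docs in queries:
        start = time.perf_counter()
        score_fn(query, docs)
        timings.append((time.perf_counter() - start) * 1000.0)

    timings.sort()
    # Nearest-rank 95th percentile, clamped to a valid index.
    p95_index = max(0, min(len(timings) - 1, math.ceil(0.95 * len(timings)) - 1))
    return {
        "p50_ms": statistics.median(timings),
        "p95_ms": timings[p95_index],
        "max_ms": timings[-1],
    }


# Stand-in scorer so the sketch runs without a model; swap in the real reranker.
def dummy_scorer(query: str, docs: Sequence[str]) -> List[float]:
    return [float(len(doc)) for doc in docs]


print(measure_latency_ms(dummy_scorer, [("indemnity cap", ["clause a", "clause b"])] * 20))
```

Reporting p95 alongside the median matters here because a search box is judged by its slow queries, and the number of candidates passed to the reranker drives the tail.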
Earning criteria — what you'll demonstrate
- Fine-tune a cross-encoder reranker on domain pairs
- Evaluate retrieval+rerank stacks with nDCG and Recall
- Measure inference latency under a deployment-realistic load
- Design a fallback path for ML model failures
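The fallback criterion can be illustrated with a small wrapper around the rerank step. This is a sketch under stated assumptions, not a prescribed design: both rerankers are modeled as hypothetical callables that return one score per candidate, the first-stage retriever has already produced the top-K candidates, and "failing a health check" is narrowed to the primary raising an exception or returning malformed scores. All names are illustrative.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer signature: (query, candidate documents) -> one score per document.
Scorer = Callable[[str, Sequence[str]], List[float]]


def scores_are_valid(scores: Sequence[float], expected: int) -> bool:
    """A minimal health check: one finite score per candidate."""
    return len(scores) == expected and all(math.isfinite(s) for s in scores)


def rerank_with_fallback(
    query: str,
    candidates: Sequence[str],
    primary: Scorer,
    fallback: Scorer,
    top_n: int = 10,
) -> Tuple[List[str], str]:
    """Rerank top-K candidates with the primary scorer, else the fallback.

    Returns the top_n documents and a tag naming which scorer produced them.
    """
    try:
        scores = primary(query, candidates)
        if not scores_are_valid(scores, len(candidates)):
            raise ValueError("primary reranker returned malformed scores")
        used = "fine-tuned"
    except Exception:
        # Assumption: any failure of the fine-tuned model routes to the generic one.
        scores = fallback(query, candidates)
        used = "generic"

    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order[:top_n]], used


# Stand-in scorers so the sketch runs without models.
def failing_primary(query: str, docs: Sequence[str]) -> List[float]:
    raise RuntimeError("model failed to load")


def generic_scorer(query: str, docs: Sequence[str]) -> List[float]:
    return [float(len(doc)) for doc in docs]


print(rerank_with_fallback("termination notice", ["a", "ccc", "bb"], failing_primary, generic_scorer))
```

Returning the tag alongside the results makes it possible to log how often the fallback fires, which is useful evidence for the deployment recommendation.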
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Career paths this builds toward
Canonical roles
NLP Engineer
Owning a domain-fine-tuned reranker end-to-end is core NLP-engineer work at legal-tech, biomedical, and other vertical-search vendors.
This challenge sharpens
- cross-encoder-reranker
- fine-tuning
- transformers
Machine Learning Engineer
Latency-aware deployment design with a fallback plan is the MLE skillset that production ML teams hire for.
This challenge sharpens
- latency-benchmarking
- deployment-design
- fine-tuning
AI Engineer
Wiring a fine-tuned reranker into a top-K-then-rerank architecture is the kind of glue engineering that AI engineers routinely do at AI startups.
This challenge sharpens
- cross-encoder-reranker
- deployment-design
- ir-evaluation