Code

Build a BM25 + Embeddings Hybrid Search for a Legal-Tech Document Portal

Free · Verified credential · 4 weeks · Advanced

Overview

What this challenge is about.

Stand up an OpenSearch cluster and index the 2.4M-document corpus for BM25 retrieval. Generate dense embeddings (you choose the model; justify the cost and quality trade-offs) and index them in a vector store (OpenSearch's k-NN module is acceptable). Implement a hybrid retriever that merges the BM25 and dense result lists using reciprocal rank fusion. Curate a 500-query relevance-judgment set with 3 CS team members over 2 weeks. Evaluate BM25-only, dense-only, and hybrid retrieval on MRR (mean reciprocal rank) and Recall@10. Ship the winner behind a feature flag to 10 percent of users for a 1-week telemetry check. Deliver the code, the eval report, and the rollout plan.
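To make the fusion step concrete, here is a minimal sketch of reciprocal rank fusion over two ranked lists of document IDs. The function name `reciprocal_rank_fusion`, the smoothing constant `k=60` (a commonly used default, not something the brief specifies), and the sample document IDs are all assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=10):
    """Fuse several ranked lists of doc IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default; tune it on your data.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by doc ID for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in fused[:top_n]]


# Hypothetical result lists from the BM25 and dense retrievers.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Because the fusion uses only ranks, the BM25 scores and vector similarities never need to be put on a common scale, which is one reason this method is a convenient first baseline for hybrid retrieval.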

Credential: Blockchain-anchored
Shareable: LinkedIn-ready
Language: English
Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Ship a hybrid BM25-plus-embeddings retrieval system that beats BM25-only on MRR and Recall@10 on a curated 500-query relevance set.
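The two metrics named above can be computed along these lines. This is a sketch under assumptions: `judgments` maps each query ID to the set of relevant document IDs from the curated set, `run` maps each query ID to a retriever's ranked list, and relevance is binary. The function names and sample data are hypothetical.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Return the fraction of relevant documents found in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


def evaluate(run, judgments, k=10):
    """Average both metrics over every judged query (missing runs count as empty)."""
    query_ids = list(judgments)
    n = len(query_ids)
    mrr = sum(reciprocal_rank(run.get(q, []), judgments[q]) for q in query_ids) / n
    recall = sum(recall_at_k(run.get(q, []), judgments[q], k) for q in query_ids) / n
    return {"mrr": mrr, "recall": recall}


# Tiny hypothetical example with two judged queries.
judgments = {"q1": {"d1", "d2"}, "q2": {"d9"}}
run = {"q1": ["d3", "d1", "d4"], "q2": ["d9", "d8"]}
metrics = evaluate(run, judgments)
print(metrics)
```

Running `evaluate` once per system (BM25-only, dense-only, hybrid) against the same `judgments` keeps the comparison on identical queries.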

Earning criteria — what you'll demonstrate

  • Implement BM25 and dense retrieval and combine them via reciprocal rank fusion
  • Curate a relevance-judgment set without burning out the CS partners
  • Evaluate retrieval on MRR and Recall@10 and pick a winner defensibly
  • Roll out behind a feature flag and read telemetry honestly
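For the rollout criterion, one common way to expose a feature to a fixed share of users is deterministic hash bucketing, sketched below. This assumes no particular feature-flag product; the function `in_rollout`, the flag name `hybrid_search`, and the user ID format are hypothetical.

```python
import hashlib


def in_rollout(user_id: str, flag_name: str = "hybrid_search", percent: int = 10) -> bool:
    """Return True if this user falls inside the rollout percentage.

    Hashing flag name + user ID gives each user a stable bucket in [0, 100),
    so the same user sees the same variant for the whole telemetry window.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return bucket < percent


# Hypothetical user IDs; the observed share should land near 10 percent.
users = [f"user-{i}" for i in range(10_000)]
share = sum(in_rollout(u) for u in users) / len(users)
print(f"share in rollout: {share:.3f}")
```

Including the flag name in the hash input means a different flag buckets users independently, so the same 10 percent of users are not always the ones exposed to experiments.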

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career mappings coming soon.

One more thing

You can put a credential on your CV by Friday.

Build a BM25 + Embeddings Hybrid Search for a Legal-Tech Document Portal | Ewance Challenge