Analysis

Benchmark Approximate Nearest-Neighbor Indexes for a Code-Search Startup

Free · Verified credential · 2 weeks · Advanced

Overview

What this challenge is about.

You receive a 5 M-vector sample (768-dim, float32) and a 1,000-query labeled benchmark with ground-truth top-50 neighbors per query. Index the same sample in Chroma, Qdrant, and Weaviate, each using its HNSW index, and tune them so their recall@10 values land within 5 percentage points of each other. Then measure per-query p50/p95 latency at concurrency 1 and 16, RAM footprint, on-disk size, and index build time. Re-run at 10 M vectors if your VM has the headroom. Write a 3-page memo with one clear recommendation, the trade-off table behind it, and a list of what changes at 200 M vectors.
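The two headline metrics can be computed with the standard library alone. The sketch below is one possible harness, not a prescribed one: `recall_at_k` assumes recall@10 is scored against the first 10 entries of each ground-truth top-50 list, `percentile` uses the nearest-rank definition, and `search_fn` is a hypothetical callable that wraps whichever client you are testing.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def recall_at_k(retrieved, ground_truth, k=10):
    """Mean fraction of each query's true top-k ids that appear in its top-k results."""
    hits = sum(
        len(set(got[:k]) & set(truth[:k]))
        for got, truth in zip(retrieved, ground_truth)
    )
    return hits / (k * len(ground_truth))


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def measure_latencies_ms(search_fn, queries, concurrency=1):
    """Time each search_fn(query) call, in milliseconds, using `concurrency` worker threads.

    search_fn is a hypothetical wrapper around the client under test.
    """
    def timed(query):
        start = time.perf_counter()
        search_fn(query)
        return (time.perf_counter() - start) * 1000.0

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed, queries))
```

Running `measure_latencies_ms` once with `concurrency=1` and once with `concurrency=16`, then passing each result to `percentile(..., 50)` and `percentile(..., 95)`, gives the four latency figures the memo needs. Thread-based concurrency is an assumption that suits network-bound clients; an embedded in-process store may need a different load generator.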

Credential: Blockchain-anchored
Shareable: LinkedIn-ready
Language: English
Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Pick the production approximate-nearest-neighbor store for a code-search workload by benchmarking Chroma, Qdrant, and Weaviate on recall, latency, RAM, and build time at the same operating point.
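One way to arrive at "the same operating point" is to sweep the search-time parameter on each engine, record recall@10 at each setting, and then choose the cheapest setting per engine that reaches a shared target. The helper below is a minimal sketch of that selection step. The function name, the shape of `sweeps`, and the 0.05 default spread (taken from the 5-percentage-point rule in the overview) are assumptions for illustration, and each engine exposes its search-time knob under its own name.

```python
def pick_matched_settings(sweeps, target_recall, max_spread=0.05):
    """Pick, per engine, the smallest ef_search whose recall@10 reaches target_recall.

    sweeps maps an engine name to a list of (ef_search, recall_at_10) pairs
    measured on the same queries. Raises ValueError if an engine never reaches
    the target, or if the chosen points differ by more than max_spread.
    """
    chosen = {}
    for engine, points in sweeps.items():
        # Sort by ef_search so the first eligible point is the cheapest one.
        eligible = [(ef, rec) for ef, rec in sorted(points) if rec >= target_recall]
        if not eligible:
            raise ValueError(f"{engine} never reaches recall {target_recall}")
        chosen[engine] = eligible[0]

    recalls = [rec for _, rec in chosen.values()]
    if max(recalls) - min(recalls) > max_spread:
        raise ValueError("chosen operating points exceed the allowed recall spread")
    return chosen
```

Latency, RAM, and build time are then compared only at the returned settings, so a faster engine is not rewarded for simply returning worse results.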

Earning criteria — what you'll demonstrate

  • Understand HNSW parameters (M, ef_construction, ef_search) and how they trade quality for latency
  • Design a fair vector-store benchmark at matched recall
  • Project capacity from a 5 M-vector measurement to a 200 M-vector production target
  • Defend an infrastructure recommendation to engineering leadership in writing
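For the capacity projection, a back-of-the-envelope estimate is a useful sanity check on measured RAM. The raw float32 vectors alone take 5,000,000 × 768 × 4 bytes = 15.36 GB at 5 M and 614.4 GB at 200 M. The sketch below adds a rough graph overhead under two loudly stated assumptions: up to 2·M links per vector on the bottom layer at 4 bytes per link (the hnswlib convention, which the three engines may not share), and M = 16 as an illustrative value rather than any engine's confirmed default. Upper layers, metadata, and allocator overhead are ignored.

```python
def hnsw_ram_estimate_bytes(n_vectors, dim, m, bytes_per_value=4, bytes_per_link=4):
    """Rough lower bound on HNSW RAM: raw vectors plus bottom-layer links.

    Assumes up to 2*m links per vector on layer 0; upper layers and
    engine-specific overhead are not counted.
    """
    raw_vectors = n_vectors * dim * bytes_per_value
    graph_links = n_vectors * 2 * m * bytes_per_link
    return raw_vectors + graph_links


def project_linear(measured, measured_n, target_n):
    """Scale a measured quantity linearly with vector count (an assumption, not a law)."""
    return measured * target_n / measured_n


# Estimate for the 5 M sample and the 200 M target, with the assumed M = 16.
print(hnsw_ram_estimate_bytes(5_000_000, 768, 16) / 1e9)    # 16.0 (GB)
print(hnsw_ram_estimate_bytes(200_000_000, 768, 16) / 1e9)  # 640.0 (GB)
```

If the measured footprint at 5 M is well above the estimate, `project_linear` applied to the measured figure is the safer number to quote. Linear scaling is a reasonable first guess for RAM and disk, but build time and latency typically grow faster than linearly with vector count, which is why the optional 10 M re-run is worth having as a second data point.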

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

MLOps Engineer

Picking and sizing the right infra for a vector workload is core MLOps work at any AI-product company scaling past the prototype phase.

This challenge sharpens

  • ann-indexes
  • capacity-planning
  • benchmarking

Data Engineer

Operating vector stores alongside OLTP and warehouse systems is becoming standard data-engineering scope; this challenge gives directly relevant operating experience.

This challenge sharpens

  • vector-databases
  • hnsw
  • capacity-planning

AI Solutions Architect

Translating a benchmark into a written trade-off recommendation that an exec can sign off on is the day-to-day deliverable of an AI solutions architect.

This challenge sharpens

  • benchmarking
  • vector-databases
  • capacity-planning

AI Engineer

Knowing how HNSW parameters move recall and latency is table stakes for any AI engineer shipping retrieval features against a managed vector store.

This challenge sharpens

  • hnsw
  • ann-indexes
  • python
