Benchmark Long-Context Architectures on a Legal-Doc Retrieval Task

Free · Verified credential · 3 weeks · Expert

Overview

What this challenge is about.

You receive a public legal-QA dataset (e.g., LongBench's legal split or similar) filtered to documents longer than 50,000 tokens. Implement or wrap three architectures: a sliding-window Transformer baseline, a Mamba-class state-space model, and a hybrid (e.g., Jamba-style). Fine-tune each on the same training split under a shared compute budget, then evaluate on the held-out test split for retrieval accuracy (top-1 and top-5) and on synthetic long-context needle-in-a-haystack probes. Write a six-page technical report that follows the consultancy's existing style guide.
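The two evaluation measures named above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the challenge's official scorer: the function names `top_k_accuracy` and `make_needle_probe`, the use of passage IDs as strings, and the idea of building the haystack by repeating filler tokens are all hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple


def top_k_accuracy(ranked_ids: Sequence[Sequence[str]],
                   gold_ids: Sequence[str],
                   k: int) -> float:
    """Fraction of queries whose gold passage ID is among the first k ranked IDs."""
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("ranked_ids and gold_ids must have the same length")
    if not gold_ids:
        return 0.0
    hits = sum(gold in list(ranked[:k]) for ranked, gold in zip(ranked_ids, gold_ids))
    return hits / len(gold_ids)


def make_needle_probe(filler_tokens: Sequence[str],
                      needle_tokens: Sequence[str],
                      context_len: int,
                      depth: float) -> Tuple[List[str], int]:
    """Build a synthetic probe of exactly context_len tokens.

    The needle is placed at a relative depth in [0, 1] (0 = start, 1 = end).
    Returns the token list and the index where the needle begins.
    Assumption: the haystack is made by cycling through filler_tokens.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be in [0, 1]")
    if not filler_tokens:
        raise ValueError("filler_tokens must not be empty")
    n_filler = context_len - len(needle_tokens)
    if n_filler < 0:
        raise ValueError("context_len is shorter than the needle")
    # Repeat the filler until it covers the required length, then trim.
    reps = n_filler // len(filler_tokens) + 1
    filler = (list(filler_tokens) * reps)[:n_filler]
    pos = int(round(depth * n_filler))
    return filler[:pos] + list(needle_tokens) + filler[pos:], pos
```

Sweeping `depth` and `context_len` over a grid gives one probe per cell, which makes it easier to see whether an architecture loses the needle at particular positions or lengths.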

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Determine which long-context architecture family delivers the best accuracy/compute trade-off on real legal documents.
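One way to make "best accuracy/compute trade-off" concrete is to keep only the runs that no other run beats on both axes at once (a Pareto front). The sketch below assumes compute is recorded as GPU-hours and accuracy as top-1. Both are illustrative choices, and `RunResult` and `pareto_front` are hypothetical names introduced for this example.

```python
from typing import List, NamedTuple, Sequence


class RunResult(NamedTuple):
    name: str         # label for the architecture or run (hypothetical field)
    gpu_hours: float  # compute spent; lower is better
    top1: float       # retrieval accuracy; higher is better


def pareto_front(results: Sequence[RunResult]) -> List[RunResult]:
    """Return the runs that are not dominated by any other run.

    Run `o` dominates run `r` if it uses no more compute, is at least as
    accurate, and is strictly better on at least one of the two.
    """
    front = []
    for r in results:
        dominated = any(
            o.gpu_hours <= r.gpu_hours
            and o.top1 >= r.top1
            and (o.gpu_hours < r.gpu_hours or o.top1 > r.top1)
            for o in results
        )
        if not dominated:
            front.append(r)
    # Sort by compute so the front reads from cheapest to most expensive.
    return sorted(front, key=lambda r: r.gpu_hours)
```

If more than one run survives, the front alone does not pick a winner; the report still has to argue which point on it suits the legal-retrieval use case.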

Earning criteria — what you'll demonstrate

Reason about the long-context trade-off space across architecture families
Implement a fair multi-architecture benchmark on a non-toy task
Author a publishable technical report at conference quality
Communicate architecture trade-offs to a non-research audience (lawyers)
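For the fair-benchmark criterion, one simple way to enforce a shared compute budget is to fix a GPU-hour allowance and derive each architecture's step count from its measured step time. This is only a sketch under that assumption; the challenge does not specify how the budget is defined, and `steps_within_budget` and its parameters are hypothetical.

```python
import math


def steps_within_budget(budget_gpu_hours: float,
                        seconds_per_step: float,
                        num_gpus: int) -> int:
    """Number of whole training steps that fit in a GPU-hour budget.

    Each step costs seconds_per_step of wall-clock time on num_gpus devices,
    so it consumes seconds_per_step * num_gpus GPU-seconds.
    """
    if budget_gpu_hours < 0 or seconds_per_step <= 0 or num_gpus <= 0:
        raise ValueError("budget must be >= 0; step time and GPU count must be > 0")
    budget_gpu_seconds = budget_gpu_hours * 3600.0
    return math.floor(budget_gpu_seconds / (seconds_per_step * num_gpus))
```

For example, `steps_within_budget(10, 2.0, 4)` returns `4500`. Measuring `seconds_per_step` separately for each architecture on the same hardware and sequence length lets faster models take more steps while total spend stays equal.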

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Advanced Deep Learning

Master · AI/ML

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career paths this builds toward

Canonical roles

Research Scientist
AI Research

ML Researcher

Comparing architectures under a fair, shared protocol mirrors the evaluation discipline expected of a first-year ML researcher.

This challenge sharpens

transformers
state-space-models
benchmarking

NLP Engineer

Hands-on long-context evaluation on real legal documents transfers directly to NLP engineering roles at legal-tech and enterprise-search companies.

This challenge sharpens

long-context-architectures
transformers
pytorch

One more thing

You can put a credential on your CV by Friday.
