Cut Latency and Cost on a High-Volume Summarization Service

Free · Verified credential · 3 weeks · Advanced

Overview

What this challenge is about.

You receive 30 days of anonymized request logs (prompt token counts, completion token counts, latencies, models used). Profile the cost and latency distributions, then design and benchmark four optimizations: (1) prompt compression and system-prompt slimming, (2) routing short articles to a smaller model, (3) request batching where applicable, and (4) caching for duplicate articles. Validate quality with a 200-article LLM-as-judge eval, calibrated against 30 human ratings. Deliverables: a benchmark notebook, recommended changes (PR-style), and a 4-page before/after write-up.
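As a starting point for the profiling step, the sketch below computes per-model request counts, p50/p95 latency, and total cost from log rows. The field names (`model`, `prompt_tokens`, `completion_tokens`, `latency_s`), the model labels, and every price in `PRICES` are assumptions made for illustration; the real log schema and your provider's rates may differ.

```python
import math
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens; replace with real provider rates.
PRICES = {
    "large": {"prompt": 0.0030, "completion": 0.0060},
    "small": {"prompt": 0.0005, "completion": 0.0015},
}


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def request_cost(row):
    """Cost of one request in USD under the assumed per-1K-token prices."""
    price = PRICES[row["model"]]
    return (row["prompt_tokens"] * price["prompt"]
            + row["completion_tokens"] * price["completion"]) / 1000


def profile(rows):
    """Group log rows by model and summarize latency and cost for each group."""
    by_model = defaultdict(list)
    for row in rows:
        by_model[row["model"]].append(row)
    report = {}
    for model, group in by_model.items():
        latencies = [r["latency_s"] for r in group]
        report[model] = {
            "requests": len(group),
            "p50_latency_s": percentile(latencies, 50),
            "p95_latency_s": percentile(latencies, 95),
            "total_cost_usd": sum(request_cost(r) for r in group),
        }
    return report
```

Nearest-rank is used here because it always returns an observed latency; an interpolating percentile would be an equally reasonable choice as long as the same method is applied before and after the optimizations.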

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Cut LLM cost by 30% and bring p95 latency under 1.8 s on a news-summarization service without losing quality.
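The two numeric targets can be checked with simple arithmetic. The sketch below is one way to do it, assuming you already have a baseline cost, an optimized cost, and an optimized p95 latency from your benchmark; the function name and its return keys are hypothetical. It does not cover the quality condition, which needs the separate judge-based eval.

```python
def meets_targets(baseline_cost, optimized_cost, optimized_p95_s,
                  min_cost_reduction=0.30, max_p95_s=1.8):
    """Check the brief's targets: at least a 30% cost cut and p95 latency under 1.8 s."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    # Relative saving, e.g. 0.35 means the optimized service costs 35% less.
    reduction = (baseline_cost - optimized_cost) / baseline_cost
    return {
        "cost_reduction": reduction,
        "cost_ok": reduction >= min_cost_reduction,
        "latency_ok": optimized_p95_s < max_p95_s,
    }
```

For example, a baseline of 1,000 USD and an optimized cost of 650 USD is a 35% reduction, which clears the cost target; a p95 of 1.9 s would still fail the latency target.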

Earning criteria — what you'll demonstrate

Profile LLM cost and latency distributions from real logs
Apply prompt compression, model tiering, and caching as cost levers
Calibrate LLM-as-judge against human ratings
Communicate optimization trade-offs to product stakeholders
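For the calibration criterion, one common approach is to compare judge scores with human scores on the same articles using a rank correlation plus simple agreement measures. The sketch below assumes both sets of ratings are numeric scores on a shared scale (for example 1 to 5); that scale, the function names, and the choice of Spearman correlation are illustrative assumptions, not something the brief prescribes.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(vx * vy)


def calibration_report(human, judge):
    """Compare LLM-judge scores with human scores for the same articles."""
    if len(human) != len(judge) or len(human) < 2:
        raise ValueError("need two equally long score lists with at least 2 items")
    n = len(human)
    return {
        # Spearman correlation is Pearson correlation computed on ranks.
        "spearman": pearson(average_ranks(human), average_ranks(judge)),
        "mean_abs_diff": sum(abs(h - j) for h, j in zip(human, judge)) / n,
        "exact_agreement": sum(h == j for h, j in zip(human, judge)) / n,
    }
```

With only 30 human ratings, any of these figures carries wide uncertainty, so it is worth reporting them alongside the sample size rather than as a pass/fail verdict.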

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

LLM Application Development

Master · AI/ML

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career paths this builds toward

Canonical roles

AI Engineer (AI Engineering)

AI Engineer

Profiling, optimizing, and shipping cost and latency wins on a real LLM service is the day-to-day work of AI engineers on scaling AI products.

This challenge sharpens

cost-optimization
latency-optimization
prompt-compression

MLOps Engineer

Model tiering and caching at the request level are core MLOps work on inference platforms.
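A minimal sketch of what request-level tiering and caching can look like is shown here. Everything in it is an assumption for illustration: the 400-token threshold, the `small` and `large` model labels, the in-memory dictionary cache, and the injected `summarize` and `count_tokens` callables that stand in for a real LLM client and tokenizer.

```python
import hashlib

# Hypothetical cut-off: articles at or below this token count go to the smaller model.
SHORT_ARTICLE_TOKEN_LIMIT = 400


def cache_key(article):
    """Hash of the whitespace-normalized article, so trivially different copies collide."""
    normalized = " ".join(article.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def pick_model(token_count, limit=SHORT_ARTICLE_TOKEN_LIMIT):
    """Model tiering: short articles use the cheaper model."""
    return "small" if token_count <= limit else "large"


class SummaryService:
    def __init__(self, summarize, count_tokens):
        self._summarize = summarize        # callable (model, article) -> summary text
        self._count_tokens = count_tokens  # callable (article) -> int
        self._cache = {}                   # in-memory only; a shared store is likely needed in production
        self.hits = 0
        self.misses = 0

    def summarize(self, article):
        key = cache_key(article)
        if key in self._cache:
            # Duplicate article: skip the LLM call entirely.
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        model = pick_model(self._count_tokens(article))
        summary = self._summarize(model, article)
        self._cache[key] = summary
        return summary
```

Tracking `hits` and `misses` makes the cache's contribution to cost savings measurable in the benchmark, separately from the savings that come from routing.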

This challenge sharpens

model-tiering
response-caching
cost-optimization

AI Product Manager

Owning the quality-vs-cost trade-off and the board-facing write-up is the AI PM's daily job.

This challenge sharpens

cost-optimization
llm-evaluation
model-tiering
