Define SLOs and Error Budgets for a Real-Time Trading API

Free · Verified credential · 3 weeks · Intermediate

Overview

What this challenge is about.

Pull 90 days of per-endpoint API latency and error data from Prometheus (anonymized exports are provided). Propose Service Level Indicators (SLIs) for three services, with two SLI types each (availability and latency). Set candidate SLOs (e.g., 99.95 percent availability over 30 days, p99 latency under 80 ms over 30 days). Simulate what the error budget would have looked like over the past 90 days under each candidate, then tune until the SLOs are achievable but not trivial (around 30 percent error-budget consumption in a normal month). Deliver an SLO catalog (PDF), an error-budget policy (when to pause feature work), and a one-page exec summary.
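The simulation step can be sketched in a few lines of Python. This is a minimal illustration, not the challenge's reference solution: the `DayCounts` record, the function names, and the synthetic traffic numbers are all assumptions, and the real Prometheus exports will have their own schema. For a latency SLI, a "bad" request is assumed to be one slower than the chosen threshold (for example 80 ms).

```python
from dataclasses import dataclass


@dataclass
class DayCounts:
    """Hypothetical daily aggregate for one endpoint (not the export's real schema)."""
    total: int  # all requests served that day
    bad: int    # requests that failed the SLI (errors, or slower than the threshold)


def budget_consumed(days: list[DayCounts], slo_target: float) -> float:
    """Fraction of the error budget used over `days` (1.0 means fully spent)."""
    total = sum(d.total for d in days)
    bad = sum(d.bad for d in days)
    if total == 0:
        return 0.0  # no traffic, so nothing was spent
    allowed_bad = (1.0 - slo_target) * total  # the error budget, in requests
    return bad / allowed_bad


def rolling_consumption(days: list[DayCounts], slo_target: float, window: int = 30) -> list[float]:
    """Budget consumption for every full `window`-day span in the history."""
    return [
        budget_consumed(days[i:i + window], slo_target)
        for i in range(len(days) - window + 1)
    ]


# Synthetic 90-day history: 1,000,000 requests and 150 bad requests per day.
history = [DayCounts(total=1_000_000, bad=150) for _ in range(90)]

# Compare candidate availability targets against the same history.
for target in (0.999, 0.9995, 0.9999):
    windows = rolling_consumption(history, target)
    print(f"SLO {target:.2%}: worst 30-day consumption {max(windows):.0%}")
```

With this synthetic history, the 99.95 percent candidate consumes about 30 percent of its budget in every 30-day window, the 99.9 percent candidate about 15 percent (arguably too loose), and the 99.99 percent candidate about 150 percent (not achievable). Real data will vary from window to window, which is why the worst window is reported rather than the average.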

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Define SLOs and an error-budget policy for the trading API, validated against 90 days of historical data and signed off by engineering and product.

Earning criteria — what you'll demonstrate

Distinguish SLI from SLO and SLA in a real production context
Choose SLIs that reflect user-perceived experience, not just backend metrics
Calibrate SLOs against historical data so they are achievable but meaningful
Write an error-budget policy engineering and product will both honor
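An error-budget policy is easier to honor when its triggers are unambiguous. One way to make that concrete is to express the policy as a small lookup from budget consumption to an agreed action. The sketch below is illustrative only: the 50 percent and 100 percent thresholds and the wording of each action are assumptions, to be replaced by whatever engineering and product actually agree on.

```python
# Hypothetical thresholds; the real values are whatever both teams sign off on.
CAUTION_THRESHOLD = 0.5   # half of the window's budget is spent
EXHAUSTED_THRESHOLD = 1.0  # the budget is fully spent

SHIP = "ship features normally"
CAUTION = "prioritize reliability work alongside features"
PAUSE = "pause feature work until the budget recovers"


def policy_action(consumed: float) -> str:
    """Map error-budget consumption (1.0 = fully spent) to the agreed action."""
    if consumed < 0:
        raise ValueError("consumption cannot be negative")
    if consumed >= EXHAUSTED_THRESHOLD:
        return PAUSE
    if consumed >= CAUTION_THRESHOLD:
        return CAUTION
    return SHIP


# A normal month at around 30 percent consumption stays in the first band.
for consumed in (0.3, 0.6, 1.2):
    print(f"{consumed:.0%} consumed -> {policy_action(consumed)}")
```

Keeping the thresholds as named constants means the written policy and any alerting built on it can point at the same numbers.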

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Site Reliability Engineering

Master · CS/SE

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career mappings coming soon.

One more thing

You can put a credential on your CV by Friday.
