Site Reliability & Observability Challenges
Site Reliability & Observability challenges put you on the hook for keeping production healthy. You'll build the fundamentals (Application Monitoring, Dashboard Reading, and Performance Analysis), instrument services with OpenTelemetry, Prometheus, and Grafana, then define what "healthy" means through Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
From there you'll handle the harder edges (incident command, on-call runbooks, multi-region failover, and chaos engineering) the way reliability teams actually operate under pressure. Each challenge you solve earns a verified credential you can share with recruiters.
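To make the SLO and SLI vocabulary concrete, here is a minimal sketch of the arithmetic behind an availability SLO and its error budget. The function name `error_budget`, its parameters, and the sample numbers are illustrative assumptions, not part of any challenge brief.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize an availability SLO over one measurement window.

    slo_target is a fraction such as 0.999 (99.9 percent of requests succeed).
    All names here are hypothetical and chosen for illustration only.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be a fraction between 0 and 1, e.g. 0.999")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The error budget is the number of failures the SLO tolerates in the window.
    allowed_failures = (1 - slo_target) * total_requests
    # The SLI is the measured success ratio.
    sli = 1 - failed_requests / total_requests
    # Share of the budget already spent (1.0 means fully exhausted).
    budget_consumed = failed_requests / allowed_failures

    return {
        "sli": sli,
        "allowed_failures": allowed_failures,
        "budget_consumed": budget_consumed,
        "slo_met": sli >= slo_target,
    }


if __name__ == "__main__":
    # Made-up window: 1,000,000 requests, 400 failures, 99.9 percent target.
    print(error_budget(0.999, 1_000_000, 400))
```

With these made-up inputs the budget allows roughly 1,000 failed requests, so 400 failures spend about 40 percent of it and the SLO is still met.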
Recommended Challenges
- Design · Intermediate · New
Observability for a Microservices Payments Platform
Design the observability architecture: OpenTelemetry traces from 38 services into Tempo, structured logs via Loki, RED (rate, errors, duration) metrics via Prometheus, SLOs defi…
- Observability
- OpenTelemetry Instrumentation
- SLO Design
DevOps and Secure Deployment
- Design · Intermediate · New
Design SLO-Driven Alerts for a Telco's Subscriber API
Receive a 90-day RED (Rate, Errors, Duration) metrics export for the subscriber API across 6 endpoints and 38 weeks of paging history. Define an SLO per endpoint (e.g., 99.9 per…
- SLO Design
- Alerting
- Prometheus & Grafana
Software Observability
- Design · Intermediate · New
Instrument a Model Monitoring Stack from Scratch
Pick the priority product (recommend the customer-service RAG assistant, around 40k queries/day). Define monitoring signals: input drift (Evidently/NannyML), output quality (LLM…
- Model Monitoring
- Data Drift Detection
- LLM Evaluation
ML Engineering and Production ML
- Analysis · Beginner · New
Define SLOs and Error Budgets for a Real-Time Trading API
Pull 90 days of API latency + error data per endpoint from Prometheus (anonymized exports provided). Propose Service Level Indicators (SLIs) for 3 services × 2 SLI types (availa…
- SLO / SLI Definition
- Error Budgets
- SLI Design
Site Reliability Engineering
Practice your coursework on real scenarios.
Every challenge is shaped from real industry context — not generic exercises. The work mirrors what your degree prepares you for.
Why Ewance
- Analysis · Intermediate · New
Debug Latency Tail With Distributed Tracing on a Logistics SaaS
Receive 7 days of anonymized trace data in Tempo, the service map (12 services), and the customer complaint log. Investigate: filter the slowest 1 percent of traces, identify th…
- Distributed Tracing
- Performance Analysis
- Tempo
Software Observability
- Code · Beginner · New
Implement Progressive Delivery with Flagger for an E-Commerce Backend
Install Flagger (or Argo Rollouts) into the existing Kubernetes + Istio stack. Configure canary analysis using Prometheus metrics: request-success-rate, request-duration p99, an…
- Flagger
- Argo Rollouts
- Canary Deployment
GitOps and Continuous Delivery
- Code · Intermediate · New
Instrument Network Telemetry for an ISP's Backbone
Receive the backbone topology (12 routers across 4 PoPs, mix of Cisco IOS XR + Juniper Junos), the current SNMP-based monitoring stack, and 4 weeks of customer-complaint tickets…
- Network Telemetry
- gNMI
- Kafka Event Streaming
Advanced Computer Networks
- Code · Intermediate · New
Build a Canary Rollout for a Production Recommender
Pick a serving stack (Triton, Seldon Core, KServe, or BentoML). Implement two-model traffic splitting with a configurable percentage (start at 5%). Wire up online metric collect…
- Canary Deployment
- Kubernetes Orchestration
- A/B Testing
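As a rough illustration of the two-model traffic splitting with a configurable percentage described in this brief, the sketch below routes each request deterministically by hashing its ID. The function name `choose_model`, the `request_id` parameter, and the "stable"/"canary" labels are assumptions made for the example. In a real serving stack the split would more likely be configured in the serving layer or service mesh than written by hand.

```python
import hashlib


def choose_model(request_id: str, canary_percent: float = 5.0) -> str:
    """Return "canary" for roughly canary_percent of request IDs, else "stable".

    Hashing the ID keeps the decision sticky: the same request ID always maps
    to the same model, which makes online metrics easier to compare.
    All names are hypothetical and used only for illustration.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")

    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto 10,000 buckets (0.01 percent each).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return "canary" if bucket < canary_percent * 100 else "stable"


if __name__ == "__main__":
    sample = [choose_model(f"req-{i}") for i in range(20_000)]
    share = sample.count("canary") / len(sample)
    print(f"canary share: {share:.3%}")
```

Raising `canary_percent` only ever moves additional IDs onto the canary, so a gradual ramp does not flip requests already on the canary back to the stable model.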
ML Engineering and Production ML
Browse challenges
Explore role
Product Manager
Ship product that solves real user problems. Combine user research, prototyping, and stakeholder alignment to turn ambiguous briefs into measurable wins — the role at the centre of modern software teams.
How it works
From brief to credential, in six steps.
Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.
Related skill families
Browse all skills
Industry teams behind a decade of practitioner briefs
Hiring from this pool?
Sponsor a challenge and meet candidates through actual work.
Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables — not resumes.