Overview
What this challenge is about.
You receive 30 days of anonymized card-transaction events (about 240M events in total), the team's existing batch features (cardholder behavior summaries), and a pre-trained XGBoost fraud-scoring model served behind an HTTP endpoint. Build a Flink (or Kafka Streams) job that: (1) ingests transactions from Kafka, (2) joins rolling-window cardholder features (last 1h, 24h, 7d), (3) calls the model for scoring, and (4) writes scored events to a downstream Kafka topic. The whole path, from ingest to scored output, must stay within a 500ms p95 end-to-end latency budget. Replay the 30 days at production rate (about 92 events/sec on average, 800 events/sec at peak). Deliver the code, a topology diagram, a latency and throughput report, and a 5-page architecture memo.
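As a quick sanity check on the replay numbers, the sketch below derives the average rate from the stated event count and estimates how many model calls could be in flight at peak. The 50ms model-call latency is a hypothetical placeholder, not a figure from the brief; measure the real endpoint before sizing anything.

```python
SECONDS_PER_DAY = 86_400


def average_rate(total_events: int, days: int) -> float:
    """Mean events per second over the whole replay period."""
    return total_events / (days * SECONDS_PER_DAY)


def in_flight_requests(rate_per_s: float, call_latency_s: float) -> float:
    """Little's law: mean concurrent requests = arrival rate * time in system."""
    return rate_per_s * call_latency_s


avg = average_rate(240_000_000, 30)   # figures from the brief
peak = 800.0                          # events/sec, from the brief
peak_to_avg = peak / avg

# ASSUMPTION: 50 ms per HTTP scoring call is illustrative only.
assumed_call_latency_s = 0.050
concurrent_at_peak = in_flight_requests(peak, assumed_call_latency_s)

print(f"average rate: {avg:.1f} events/sec")          # average rate: 92.6 events/sec
print(f"peak is {peak_to_avg:.1f}x the average")      # peak is 8.6x the average
print(f"in-flight model calls at peak: ~{concurrent_at_peak:.0f}")
```

The peak-to-average ratio is the more useful number for capacity planning: a job tuned only for the average rate would see roughly 8.6 times that load during peaks.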
The Brief
What you'll do, and what you'll demonstrate.
Build a Kafka-based streaming fraud-scoring pipeline (Flink or Kafka Streams) that meets a 500ms p95 end-to-end latency budget on 30 days of replayed transactions.
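One way to check the p95 target is sketched below. It assumes each scored event carries two timestamps, here hypothetically named ingest and output time, so that end-to-end latency is their difference in milliseconds. The nearest-rank percentile is one common definition; other tools interpolate and may report slightly different values.

```python
def percentile_nearest_rank(samples, pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one latency sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, giving a 1-based rank.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_budget(latencies_ms, budget_ms: float = 500.0, pct: int = 95) -> bool:
    """True if the chosen percentile of end-to-end latency is within the budget."""
    return percentile_nearest_rank(latencies_ms, pct) <= budget_ms


# Illustrative samples only: 100 events with latencies of 1..100 ms,
# plus a check where the slowest events blow the budget.
fast = [float(ms) for ms in range(1, 101)]
slow_tail = [100.0] * 90 + [900.0] * 10

print(percentile_nearest_rank(fast, 95))   # 95.0
print(meets_budget(fast))                  # True
print(meets_budget(slow_tail))             # False
```

Because p95 ignores the slowest 5% of events, a report would usually show p50 and p99 alongside it so that a long tail is not hidden.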
Earning criteria — what you'll demonstrate
- Design a low-latency streaming pipeline with stateful joins
- Implement rolling-window features over a real event stream
- Integrate ML scoring as an enrichment step under a latency budget
- Validate a streaming system against a realistic replay
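To make the rolling-window criterion concrete, here is a minimal in-memory sketch of per-cardholder counts and sums over the last 1h, 24h, and 7d. It is plain Python rather than Flink or Kafka Streams code, and it assumes events arrive in timestamp order per card; the class, field, and feature names are hypothetical. A real job would keep this state in the framework's keyed state and handle late or out-of-order events.

```python
from collections import defaultdict, deque

# Window widths in seconds, matching the 1h / 24h / 7d features in the brief.
WINDOWS_S = {"1h": 3_600, "24h": 86_400, "7d": 604_800}


class RollingFeatures:
    """Per-card rolling counts and amount sums (illustrative, not production code)."""

    def __init__(self):
        # card_id -> deque of (event_time_s, amount), oldest first
        self._events = defaultdict(deque)

    def update(self, card_id: str, ts: int, amount: float) -> dict:
        """Record one event and return features for windows ending at ts."""
        q = self._events[card_id]
        q.append((ts, amount))

        # Evict anything that has fallen out of the longest window.
        horizon = ts - max(WINDOWS_S.values())
        while q and q[0][0] <= horizon:
            q.popleft()

        features = {}
        for name, width in WINDOWS_S.items():
            # Each window covers the half-open interval (ts - width, ts].
            amounts = [a for t, a in q if t > ts - width]
            features[f"count_{name}"] = len(amounts)
            features[f"sum_{name}"] = sum(amounts)
        return features


rf = RollingFeatures()
rf.update("card-1", 0, 10.0)
rf.update("card-1", 3_000, 5.0)
f = rf.update("card-1", 4_000, 1.0)
print(f["count_1h"], f["sum_1h"])     # 2 6.0
print(f["count_24h"], f["sum_24h"])   # 3 16.0
```

This version rescans the deque on every event, which is simple but costs time proportional to the number of events a card has in the last 7 days. Pre-aggregated buckets (for example per-minute partial sums) are a common way to bound that cost under a tight latency budget.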
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Career mappings coming soon.