Tabular Q-Learning for Warehouse Slotting

Free · Verified credential · 2 weeks · Intermediate

Overview

What this challenge is about.

You receive a Python discrete-event simulator whose state is encoded as a 12-dimensional categorical vector (around 8,000 reachable states) with 6 possible slotting actions, plus 2 years of order-line data to drive simulator demand. Implement tabular Q-learning with epsilon-greedy exploration, train for 200K episodes while logging convergence, and validate on a held-out 3 months of demand. Compare the learned policy to the existing rule-based slotting on (a) average picker travel per order line and (b) percentage of high-velocity SKUs in golden zones. Success is a reduction in picker travel of 12 percent or more with no worsening of the golden-zone metric.
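As a rough illustration of the core pieces named above, here is a minimal sketch of a Q-table, epsilon-greedy action selection, the one-step Q-learning update, and a decaying exploration schedule. The simulator's API is not shown in this brief, so the sketch does not call it. All function names (`make_q_table`, `choose_action`, `q_update`, `epsilon_at`) and all hyperparameter values (`alpha`, `gamma`, the epsilon start, end, and decay fraction) are assumptions chosen for illustration, not values prescribed by the challenge.

```python
import random
from collections import defaultdict

N_ACTIONS = 6  # from the brief: 6 possible slotting actions


def make_q_table():
    # One list of 6 action values per visited state; unseen states start at zero.
    # States are assumed to be hashable tuples of 12 categorical values.
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def choose_action(q, state, epsilon, rng):
    # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    values = q[state]
    return max(range(N_ACTIONS), key=values.__getitem__)


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.95):
    # One-step Q-learning target: r + gamma * max over a' of Q(s', a').
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def epsilon_at(episode, total=200_000, start=1.0, end=0.05, decay_fraction=0.8):
    # Linear decay from start to end over the first decay_fraction of training,
    # then held constant. The schedule shape is an assumption, not a requirement.
    frac = min(1.0, episode / (total * decay_fraction))
    return start + frac * (end - start)


# Illustrative single transition with a made-up reward (negative travel distance).
rng = random.Random(0)
q = make_q_table()
state = (0,) * 12
next_state = (1,) + (0,) * 11
action = choose_action(q, state, epsilon_at(0), rng)
q_update(q, state, action, reward=-4.0, next_state=next_state, done=False)
```

With roughly 8,000 reachable states and 6 actions, a dictionary keyed by state tuples stays small; a dense array indexed by an encoded state id would be an equally reasonable choice.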

Credential: Blockchain-anchored
Shareable: LinkedIn-ready
Language: English
Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Use tabular Q-learning to learn a warehouse slotting policy that materially beats the current rule-based heuristic on simulated demand.

Earning criteria — what you'll demonstrate

  • Implement tabular Q-learning with epsilon-greedy exploration from scratch
  • Tune exploration schedules and learning rates for convergence
  • Evaluate RL policies against a non-RL baseline on operationally meaningful metrics
  • Communicate RL results to a non-technical operations audience
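The comparison against the rule-based baseline can be reduced to a small pass/fail check on the two stated metrics. The sketch below is one way to express it; the function names are hypothetical, and the numbers in the usage lines are illustrative placeholders, not results from the simulator.

```python
def relative_reduction(baseline, candidate):
    # Fraction by which the candidate improves on the baseline
    # (positive means less picker travel).
    if baseline <= 0:
        raise ValueError("baseline travel must be positive")
    return (baseline - candidate) / baseline


def meets_success_criteria(base_travel, rl_travel, base_golden_pct, rl_golden_pct,
                           min_reduction=0.12):
    # Criterion (a): at least a 12 percent reduction in average picker travel
    # per order line. Criterion (b): golden-zone percentage must not get worse.
    travel_ok = relative_reduction(base_travel, rl_travel) >= min_reduction
    golden_ok = rl_golden_pct >= base_golden_pct
    return travel_ok and golden_ok


# Placeholder numbers for illustration only.
reduction = relative_reduction(40.0, 34.0)
passed = meets_success_criteria(40.0, 34.0, base_golden_pct=71.0, rl_golden_pct=73.5)
print(f"travel reduction: {reduction:.1%}, success: {passed}")
```

Both metrics should be computed on the same held-out demand for the baseline and the learned policy, so that the difference reflects the policy rather than the demand sample.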

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Machine Learning Engineer

Implementing tabular RL on a real operational simulator and beating a rule-based baseline is the kind of MLE win that ships in logistics products.

This challenge sharpens

  • tabular-rl
  • q-learning
  • python

Data Scientist

Validating an RL policy against a heuristic baseline on operationally meaningful metrics is the daily craft of data scientists in logistics and operations roles.

This challenge sharpens

  • policy-evaluation
  • simulation
  • python

Applied AI Scientist

Choosing tabular RL over deep RL because the state space allows it is exactly the kind of judgement applied AI scientists exercise.

This challenge sharpens

  • tabular-rl
  • epsilon-greedy
  • policy-evaluation

One more thing

You can put a credential on your CV by Friday.