Train a Reinforcement-Learning Policy for Drone Obstacle Avoidance

Free · Verified credential · 3 weeks · Expert

Overview

What this challenge is about.

You receive a custom Gymnasium drone-flight environment, a hand-engineered baseline controller, and a target evaluation suite covering 4 obstacle densities. Train a PPO policy with vectorized rollouts (Stable-Baselines3 or RLlib) within a fixed compute budget of 24 GPU-hours. Evaluate over 200 rollouts per density, measuring success rate, mean path length, and collision rate. Compare the results against the hand-engineered baseline, then write a 4-page sim-to-real gap memo that names what is needed to move to a real-drone trial.
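The per-density evaluation described above can be sketched as a small aggregation step. This is a minimal illustration, not the challenge's official harness: the `Episode` record, its field names, and the density labels are hypothetical, and averaging path length over all rollouts (rather than only successful ones) is an assumption the brief does not settle.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Episode:
    """Outcome of one evaluation rollout (hypothetical record layout)."""
    density: str        # obstacle-density label, e.g. "sparse"
    success: bool       # True if the drone reached the goal
    collided: bool      # True if the drone hit an obstacle
    path_length: float  # distance flown, in the environment's units


def summarize(episodes):
    """Group rollouts by density and compute the three headline metrics."""
    by_density = {}
    for ep in episodes:
        by_density.setdefault(ep.density, []).append(ep)

    report = {}
    for density, eps in by_density.items():
        n = len(eps)
        report[density] = {
            "rollouts": n,
            "success_rate": sum(ep.success for ep in eps) / n,
            "collision_rate": sum(ep.collided for ep in eps) / n,
            # Assumption: averaged over all rollouts, not only successes.
            "mean_path_length": mean(ep.path_length for ep in eps),
        }
    return report
```

Running the same `summarize` over the baseline controller's rollouts and the PPO policy's rollouts yields two reports with identical keys, which makes the per-density comparison a row-by-row read.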

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Train a PPO obstacle-avoidance policy that beats the hand-engineered baseline across obstacle densities and supports a credible sim-to-real plan.

Earning criteria — what you'll demonstrate

Apply PPO to a continuous-control robotics task end-to-end
Design structured evaluation suites for RL policies
Reason about the sim-to-real gap explicitly
Communicate RL trade-offs to a non-RL audience
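One ingredient of a structured evaluation suite is an uncertainty estimate on each reported rate. With 200 rollouts per density, a success rate is a binomial proportion and carries sampling noise. The sketch below uses the Wilson score interval, a common choice for such proportions; the challenge itself does not prescribe any particular interval, so treat this as one reasonable option.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% coverage.
    Returns (lower, upper), clipped to [0, 1].
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Illustrative counts only, not real results.
    print(wilson_interval(150, 200))
```

If the intervals for the policy and the baseline overlap heavily at a given density, the memo can say so plainly rather than claiming a win on a noisy point estimate.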

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Advanced Robotics

Master · AI/ML

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career paths this builds toward

Canonical roles

Machine Learning Engineer
AI Engineering

ML Researcher

End-to-end RL training with structured evaluation and an honest sim-to-real memo is the canonical first project for a junior ML researcher on a robotics team.

This challenge sharpens

reinforcement-learning
ppo
policy-evaluation

Research Scientist

Domain-randomization design and per-condition evaluation discipline are the research-scientist skills that get cited in robotics labs.

This challenge sharpens

reinforcement-learning
sim-to-real
policy-evaluation

Machine Learning Engineer

Reproducible RL training infrastructure with Docker + W&B is the MLE-flavored half of any RL project.

This challenge sharpens

pytorch
robotics-simulation
ppo

One more thing

You can put a credential on your CV by Friday.
