Code

Train a Reinforcement-Learning Policy for Drone Obstacle Avoidance

Free · Verified credential · 3 weeks · Expert

Overview

What this challenge is about.

You receive a custom Gymnasium drone-flight environment, a baseline hand-engineered controller, and a target evaluation suite covering 4 obstacle densities. Train a PPO policy with vectorized rollouts (Stable-Baselines3 or RLlib) within a fixed compute budget of 24 GPU-hours. Evaluate over 200 rollouts per density, measuring success rate, mean path length, and collision rate. Compare the results against the hand-engineered baseline, then write a 4-page sim-to-real gap memo that names what is needed to move to a real-drone trial.
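The per-density evaluation described above could be aggregated roughly as follows. This is a minimal sketch, not the challenge's official scoring code: the `Rollout` record, its field names, and the density label are hypothetical, and two choices are assumptions made here for illustration (mean path length is averaged over successful rollouts only, and the success rate is reported with a 95% Wilson interval so that 200 rollouts per density are not over-read).

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Rollout:
    """One evaluation episode (hypothetical record, not the provided env's API)."""
    density: str        # obstacle-density label, e.g. "sparse"
    success: bool       # reached the goal
    collided: bool      # hit an obstacle
    path_length: float  # distance flown, in the environment's units


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def summarize(rollouts: List[Rollout]) -> Dict[str, dict]:
    """Group rollouts by density and compute the three headline metrics."""
    by_density: Dict[str, List[Rollout]] = {}
    for r in rollouts:
        by_density.setdefault(r.density, []).append(r)

    summary: Dict[str, dict] = {}
    for density, rs in by_density.items():
        n = len(rs)
        n_success = sum(r.success for r in rs)
        n_collided = sum(r.collided for r in rs)
        # Assumption: path length is only meaningful for successful flights.
        lengths = [r.path_length for r in rs if r.success]
        mean_len: Optional[float] = sum(lengths) / len(lengths) if lengths else None
        summary[density] = {
            "n": n,
            "success_rate": n_success / n,
            "success_ci95": wilson_interval(n_success, n),
            "collision_rate": n_collided / n,
            "mean_path_length": mean_len,
        }
    return summary
```

Running `summarize` once on the learned policy's rollouts and once on the baseline controller's rollouts gives two tables keyed by density. Comparing them row by row, rather than pooling all densities into one number, keeps a regression at the densest setting from being hidden by gains at the sparse ones.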

Credential: Blockchain-anchored
Shareable: LinkedIn-ready
Language: English
Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Train a PPO obstacle-avoidance policy that beats the hand-engineered baseline across obstacle densities and supports a credible sim-to-real plan.

Earning criteria — what you'll demonstrate

  • Apply PPO to a continuous-control robotics task end-to-end
  • Design structured evaluation suites for RL policies
  • Reason about the sim-to-real gap explicitly
  • Communicate RL trade-offs to a non-RL audience

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

ML Researcher

End-to-end RL training with structured evaluation and an honest sim-to-real memo is the canonical first project for a junior ML researcher on a robotics team.

This challenge sharpens

  • reinforcement-learning
  • ppo
  • policy-evaluation

Research Scientist

Domain-randomization design and per-condition evaluation discipline are the research-scientist skills that get cited in robotics labs.

This challenge sharpens

  • reinforcement-learning
  • sim-to-real
  • policy-evaluation

Machine Learning Engineer

Reproducible RL training infrastructure with Docker + W&B is the MLE-flavored half of any RL project.

This challenge sharpens

  • pytorch
  • robotics-simulation
  • ppo

One more thing

You can put a credential on your CV by Friday.

Train a Reinforcement-Learning Policy for Drone Obstacle Avoidance | Ewance Challenge