Model-Based RL for a Robotic Arm Pick-Place Task

Free · Verified credential · 4 weeks · Expert

Overview

What this challenge is about.

You receive a PyBullet pick-and-place environment (Franka Panda arm, 12 object types, randomized starting poses) and a SAC baseline that reaches 85% success after about 1.5 million environment steps. Implement DreamerV3 or a similar latent-dynamics world model, with a learned policy that acts via imagined rollouts. For both methods, measure success rate against environment steps up to 1 million steps, averaged over 5 seeds. Write a 2-page memo on the engineering ROI of model-based RL (MBRL) for this task class.
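As a rough illustration of what acting via imagined rollouts means, the sketch below scores a policy by rolling it forward inside a dynamics model rather than the simulator. Everything in it is an assumption made for illustration: the 1-D latent state, the hand-written `toy_dynamics` and `toy_reward` functions (stand-ins for networks a Dreamer-style agent would learn), and the default horizon and discount. None of these names come from the challenge's codebase.

```python
from typing import Callable


# Hypothetical stand-ins for learned networks; a real world model would
# fit these to replayed environment data.
def toy_dynamics(z: float, a: float) -> float:
    """Predict the next latent state from the current latent and action."""
    return 0.9 * z + a


def toy_reward(z: float) -> float:
    """Predict reward from a latent state (highest when z is at 1.0)."""
    return -abs(1.0 - z)


def imagined_return(
    z0: float,
    policy: Callable[[float], float],
    horizon: int = 15,
    gamma: float = 0.99,
) -> float:
    """Discounted return of `policy` rolled out purely inside the model."""
    z, total, discount = z0, 0.0, 1.0
    for _ in range(horizon):
        a = policy(z)            # act on the imagined latent
        z = toy_dynamics(z, a)   # no simulator step is consumed here
        total += discount * toy_reward(z)
        discount *= gamma
    return total


def idle_policy(z: float) -> float:
    """Never moves; the toy latent stays where it started."""
    return 0.0


def reach_policy(z: float) -> float:
    """Picks the action that lands the toy model on z = 1.0."""
    return 1.0 - 0.9 * z


if __name__ == "__main__":
    print("idle :", imagined_return(0.0, idle_policy))
    print("reach:", imagined_return(0.0, reach_policy))
```

In a full agent the imagined returns would feed an actor-critic update, and only the data used to fit the model would count against the environment-step budget.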

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Quantify the sample-efficiency advantage of a model-based RL agent over a strong model-free baseline on a realistic manipulation task.
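One possible way to turn "sample-efficiency advantage" into a number is to average the per-seed success curves and report the first logged step at which each method's mean crosses a target success rate. The sketch below assumes every seed logs success at the same checkpoints; the function names and the numbers in the example are placeholders, not results from this environment.

```python
from statistics import mean, stdev
from typing import List, Optional, Sequence, Tuple


def seed_average(curves: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Per-checkpoint (mean, standard error) across seeds.

    curves[s][k] is the success rate of seed s at checkpoint k.
    """
    n = len(curves)
    stats = []
    for values in zip(*curves):  # group the seeds at each checkpoint
        sem = stdev(values) / n ** 0.5 if n > 1 else 0.0
        stats.append((mean(values), sem))
    return stats


def steps_to_threshold(
    steps: Sequence[int], mean_curve: Sequence[float], threshold: float
) -> Optional[int]:
    """First logged step at which the mean success rate reaches threshold."""
    for step, value in zip(steps, mean_curve):
        if value >= threshold:
            return step
    return None  # threshold never reached within the step budget


if __name__ == "__main__":
    # Placeholder data for 5 seeds and 4 checkpoints (not measured results).
    steps = [250_000, 500_000, 750_000, 1_000_000]
    curves = [
        [0.40, 0.70, 0.85, 0.90],
        [0.30, 0.60, 0.80, 0.90],
        [0.50, 0.70, 0.90, 0.90],
        [0.40, 0.80, 0.85, 0.95],
        [0.40, 0.70, 0.85, 0.85],
    ]
    stats = seed_average(curves)
    means = [m for m, _ in stats]
    print(steps_to_threshold(steps, means, threshold=0.8))
```

Running the same function on the baseline's curves and taking the ratio of the two step counts gives a single efficiency figure. A `None` result means the method never reached the threshold within the budget and should be reported as such rather than extrapolated.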

Earning criteria — what you'll demonstrate

Implement a latent-dynamics world model for control
Compare model-based vs. model-free RL fairly on sample efficiency
Run controlled ablations on world-model hyperparameters
Reason about engineering ROI of complex RL methods in production
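For the ablation criterion, a controlled study usually means enumerating every (setting, seed) pair up front so that no configuration is silently skipped. The sketch below expands a small grid into run configs. The hyperparameter names and values are hypothetical and should be replaced with whatever your world-model implementation actually exposes.

```python
from itertools import product
from typing import Dict, List, Sequence

# Hypothetical hyperparameter names and values, for illustration only.
ABLATION_GRID: Dict[str, Sequence] = {
    "latent_dim": [32, 64],
    "imagination_horizon": [5, 15],
    "kl_scale": [0.1, 1.0],
}
SEEDS = [0, 1, 2, 3, 4]  # the brief asks for 5 seeds


def build_runs(grid: Dict[str, Sequence], seeds: Sequence[int]) -> List[dict]:
    """Expand a grid into one config dict per (setting, seed) pair."""
    names = sorted(grid)  # fixed order keeps the run list reproducible
    runs = []
    for values in product(*(grid[name] for name in names)):
        setting = dict(zip(names, values))
        for seed in seeds:
            runs.append({**setting, "seed": seed})
    return runs


if __name__ == "__main__":
    runs = build_runs(ABLATION_GRID, SEEDS)
    print(len(runs), "runs; first:", runs[0])
```

A full grid grows multiplicatively (here 2 x 2 x 2 settings x 5 seeds = 40 runs), so varying one factor at a time around a default configuration may be the more affordable design when each run consumes up to 1 million environment steps.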

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Deep Reinforcement Learning

Master · AI/ML

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career paths this builds toward

Canonical roles

Research Scientist (AI Research)

Implementing a recent research method (Dreamer) and running rigorous ablations against a strong baseline is exactly the work expected of a junior research scientist on an RL team.

This challenge sharpens

model-based-rl
world-models
experiment-design

ML Researcher

Sample-efficiency comparisons with proper compute accounting are the kind of practical research questions ML researchers answer for product teams.

This challenge sharpens

reinforcement-learning
experiment-design
pytorch

Applied AI Scientist

Translating an RL research result into an engineering-ROI memo is core applied-AI-scientist work in any robotics company.

This challenge sharpens

model-based-rl
manipulation
world-models

One more thing

You can put a credential on your CV by Friday.
