Research

Tune a PPO Policy for an Energy-Storage Trading Bot

Free · Verified credential · 3 weeks · Advanced

Overview

What this challenge is about.

You receive 18 months of 15-minute Nordic spot-price data, a battery dynamics model (capacity, round-trip efficiency, degradation curve), and a rule-based baseline that earns about EUR 42/MWh/day. Train a PPO policy in a custom Gym environment to choose charge/discharge actions as a continuous power setpoint. Backtest the trained policy on a held-out 3-month period and report mean daily profit, profit volatility, max drawdown, and the percentage of the degradation budget consumed. Then write a 2-page memo for the trading desk that explicitly addresses the risk of overfitting to the training period.
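The four backtest numbers above can be computed from a list of daily profits. The sketch below is one possible way to do it, not the challenge's reference implementation: the function name, the argument names, and the choice to measure drawdown on cumulative profit from a zero starting point are all assumptions made for illustration.

```python
from itertools import accumulate
from statistics import mean, stdev


def backtest_metrics(daily_profit, degradation_used, degradation_budget):
    """Summarise a held-out backtest.

    daily_profit: list of daily profits in EUR (hypothetical input format).
    degradation_used / degradation_budget: same unit, e.g. equivalent full
    cycles (an assumption; the brief does not fix the unit).
    """
    peak = 0.0          # running high-water mark of cumulative profit
    max_drawdown = 0.0  # largest peak-to-trough fall seen so far
    for cumulative in accumulate(daily_profit):
        peak = max(peak, cumulative)
        max_drawdown = max(max_drawdown, peak - cumulative)
    return {
        "mean_daily_profit": mean(daily_profit),
        "profit_volatility": stdev(daily_profit),  # sample standard deviation
        "max_drawdown": max_drawdown,
        "degradation_budget_pct": 100.0 * degradation_used / degradation_budget,
    }
```

Running the same function on the rule-based baseline's daily profits over the same held-out window gives a like-for-like comparison for the memo.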

Credential: Blockchain-anchored
Shareable: LinkedIn-ready
Language: English
Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Train and backtest a PPO bidding policy for grid-scale battery storage and quantify whether it beats the rule-based baseline net of degradation and risk.
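For orientation, PPO's core idea is a clipped surrogate objective that limits how far one update can move the policy. The sketch below computes that objective for a small batch in plain Python. The function name and the default `clip_eps=0.2` are illustrative assumptions, and a real training run would normally rely on an RL library rather than this hand-rolled version.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximised).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps is a tunable assumption.
    """
    terms = []
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)
```

The clip range is one of the knobs a tuning exercise like this one would typically explore, alongside learning rate and rollout length.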

Earning criteria — what you'll demonstrate

  • Implement and tune Proximal Policy Optimization on a continuous-control problem
  • Design a realistic RL environment around a physical system with degradation costs
  • Backtest a learned policy with held-out time periods to detect overfitting
  • Communicate RL results to a non-ML quant audience
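To make the environment-design criterion concrete, here is a minimal sketch of a single 15-minute transition for a battery with a degradation cost. Everything in it is an assumption for illustration: the class name, the default parameter values, the even split of round-trip efficiency across the charge and discharge legs, and the linear cost per MWh of throughput (the challenge supplies its own degradation curve). It mirrors the shape of a Gym `step` but does not import any RL library.

```python
import math
from dataclasses import dataclass


@dataclass
class BatteryStep:
    # All defaults are hypothetical placeholders, not challenge data.
    capacity_mwh: float = 10.0
    max_power_mw: float = 5.0
    round_trip_eff: float = 0.90
    degradation_eur_per_mwh: float = 4.0
    dt_hours: float = 0.25  # one 15-minute interval

    def step(self, soc_mwh, action, price_eur_per_mwh):
        """Return (new_soc_mwh, reward_eur); action in [-1, 1], positive = discharge."""
        leg_eff = math.sqrt(self.round_trip_eff)  # efficiency per leg
        power_mw = max(-1.0, min(1.0, action)) * self.max_power_mw
        if power_mw >= 0:
            # Discharge: delivered energy is capped by what is stored.
            grid_mwh = min(power_mw * self.dt_hours, soc_mwh * leg_eff)
            cell_mwh = grid_mwh / leg_eff
            new_soc = soc_mwh - cell_mwh
            revenue = grid_mwh * price_eur_per_mwh
        else:
            # Charge: purchased energy is capped by remaining headroom.
            headroom_mwh = self.capacity_mwh - soc_mwh
            grid_mwh = min(-power_mw * self.dt_hours, headroom_mwh / leg_eff)
            cell_mwh = grid_mwh * leg_eff
            new_soc = soc_mwh + cell_mwh
            revenue = -grid_mwh * price_eur_per_mwh
        reward = revenue - self.degradation_eur_per_mwh * cell_mwh
        return new_soc, reward
```

Charging the degradation cost inside the reward, rather than subtracting it after the backtest, means the policy sees the wear it causes at every step.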

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Applied AI Scientist

Coupling RL training with rigorous backtests and a trader-facing memo is core applied-AI-scientist work at any quant or climate-tech firm.

This challenge sharpens

  • ppo
  • backtesting
  • risk-analysis

ML Researcher

Designing a faithful RL environment around a physical system with degradation costs is the kind of problem ML researchers tackle in industry research labs.

This challenge sharpens

  • policy-gradients
  • environment-design
  • reinforcement-learning

Data Scientist

Walk-forward evaluation and overfitting analysis on time-series data are core data-science skills that transfer to any forecasting or trading role.
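As a rough illustration of walk-forward evaluation, the sketch below generates rolling train/test index windows in which each test window always comes after its training window. The function name, the rolling (fixed-length) training window, and the default step equal to the test size are assumptions, not requirements of the challenge.

```python
def walk_forward_splits(n_samples, train_size, test_size, step=None):
    """Return a list of (train_range, test_range) index windows.

    Each test window starts exactly where its training window ends, so no
    future data leaks into training. step defaults to test_size, which
    makes the test windows non-overlapping.
    """
    step = step or test_size
    splits = []
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        splits.append((train, test))
        start += step
    return splits
```

With 15-minute data, sizes would be given in intervals (96 per day), and a policy whose profit drops sharply from the training windows to the test windows is a candidate for the overfitting discussion.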

This challenge sharpens

  • backtesting
  • risk-analysis
  • environment-design

One more thing

You can put a credential on your CV by Friday.