Policy-Gradient Trading Agent on Historical Data

Free · Verified credential · 2 weeks · Advanced

Overview

What this challenge is about.

You receive 5 years of daily OHLCV (Open/High/Low/Close/Volume) data for 5 large-cap stocks. Build an episodic environment in which each episode is one calendar year and the agent's action each day is the vector of portfolio weights across the 5 stocks. Implement REINFORCE with a learned baseline. Train on years 1-3, validate on year 4, and test on year 5 (strictly walk-forward, no leakage). Report annualized return, Sharpe ratio, max drawdown, and turnover, plus the same metrics for an equal-weight buy-and-hold baseline. The success criterion is HONEST reporting (whether or not the agent beats that baseline) plus a written discussion of overfitting risk.
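The year-based split described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name `walk_forward_split` and the synthetic weekday calendar are assumptions made for the example, and real data would use the actual trading dates from the OHLCV files.

```python
from datetime import date, timedelta

def walk_forward_split(dates):
    """Map row indices to train/val/test by calendar year.

    Assumes `dates` is sorted ascending and covers exactly five calendar
    years: years 1-3 are train, year 4 is validation, year 5 is test.
    """
    if any(a > b for a, b in zip(dates, dates[1:])):
        raise ValueError("dates must be sorted ascending")
    years = sorted({d.year for d in dates})
    if len(years) != 5:
        raise ValueError(f"expected 5 calendar years, got {len(years)}")
    role = dict(zip(years, ["train", "train", "train", "val", "test"]))
    split = {"train": [], "val": [], "test": []}
    for i, d in enumerate(dates):
        split[role[d.year]].append(i)
    return split

# Hypothetical calendar: every weekday from 2015 through 2019.
dates = []
d = date(2015, 1, 1)
while d.year <= 2019:
    if d.weekday() < 5:
        dates.append(d)
    d += timedelta(days=1)

split = walk_forward_split(dates)
print({name: len(idx) for name, idx in split.items()})
```

Splitting the indices is only part of avoiding leakage: any statistic used to scale or normalize features would also need to be computed from the train indices alone and then applied unchanged to the validation and test years.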

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Train a REINFORCE policy-gradient trading agent and report honest walk-forward performance against a buy-and-hold baseline.
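One way to compute the reported metrics and the buy-and-hold comparison is sketched below in plain Python. The conventions are assumptions for illustration: 252 trading days per year, a zero risk-free rate by default, turnover measured as half the L1 change in target weights between consecutive days (ignoring intraday drift), and hypothetical function names throughout. Whatever conventions you choose, state them in the report and apply them identically to the agent and the baseline.

```python
import math

TRADING_DAYS = 252  # assumed annualization factor for daily data

def annualized_return(daily_returns):
    """Geometric annualized return from simple daily returns."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth ** (TRADING_DAYS / len(daily_returns)) - 1.0

def sharpe_ratio(daily_returns, daily_risk_free=0.0):
    """Annualized Sharpe ratio using the sample standard deviation."""
    excess = [r - daily_risk_free for r in daily_returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    if var == 0.0:
        return float("nan")  # undefined when returns never vary
    return mean / math.sqrt(var) * math.sqrt(TRADING_DAYS)

def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the equity curve, as a positive fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

def mean_turnover(weights):
    """Average one-way turnover between consecutive daily weight vectors."""
    changes = [0.5 * sum(abs(b - a) for a, b in zip(w0, w1))
               for w0, w1 in zip(weights, weights[1:])]
    return sum(changes) / len(changes)

def buy_and_hold_returns(closes):
    """Daily returns of an equal-weight portfolio bought on day 0 and never rebalanced.

    `closes` is a list of per-day lists of close prices, one price per stock.
    """
    first = closes[0]
    values = [sum(p / p0 for p, p0 in zip(day, first)) / len(first) for day in closes]
    return [v1 / v0 - 1.0 for v0, v1 in zip(values, values[1:])]

# Tiny made-up example with two stocks and three days of closes.
closes = [[100.0, 50.0], [110.0, 50.0], [99.0, 55.0]]
baseline = buy_and_hold_returns(closes)
print(annualized_return(baseline), sharpe_ratio(baseline), max_drawdown(baseline))
```

Because the buy-and-hold portfolio is never rebalanced, its turnover after the initial purchase is zero, which makes it a useful reference point when the agent's turnover is high.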

Earning criteria — what you'll demonstrate

Derive and implement REINFORCE with a baseline in PyTorch
Design a leak-free walk-forward backtest
Evaluate RL policies with risk-adjusted metrics, not just returns
Practice honest reporting of negative or marginal RL results
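For the first criterion, the standard REINFORCE-with-baseline estimator can be written out as below. The notation is an assumption chosen for this sketch: a stochastic policy pi_theta over daily weight vectors, per-day reward r_t, a discount factor gamma, and a learned state-dependent baseline b_phi. The challenge text does not fix any of these symbols.

```latex
% Return-to-go from day t in an episode of T trading days
G_t = \sum_{k=t}^{T-1} \gamma^{\,k-t}\, r_k

% Policy-gradient estimator with a state-dependent baseline
\nabla_\theta J(\theta)
  = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ \sum_{t=0}^{T-1}
      \nabla_\theta \log \pi_\theta(a_t \mid s_t)\,
      \bigl(G_t - b_\phi(s_t)\bigr) \right]

% The baseline adds no bias because the score function has zero mean
\mathbb{E}_{a \sim \pi_\theta(\cdot \mid s)}\!\left[
    \nabla_\theta \log \pi_\theta(a \mid s)\, b_\phi(s) \right]
  = b_\phi(s)\, \nabla_\theta \int \pi_\theta(a \mid s)\, da
  = b_\phi(s)\, \nabla_\theta 1 = 0

% The baseline itself is fit by regression onto the observed returns
L(\phi) = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ \sum_{t=0}^{T-1}
      \bigl(G_t - b_\phi(s_t)\bigr)^2 \right]
```

In a PyTorch implementation, the term G_t - b_phi(s_t) is typically detached from the graph when forming the policy loss, so that the baseline parameters are updated only through the regression loss.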

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Reinforcement Learning

Master · AI/ML

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

ML Researcher

Implementing a clean REINFORCE study with honest walk-forward reporting is the kind of integrity-first research that quant and research teams hire for.

This challenge sharpens

policy-gradients
reinforce
honest-reporting

Applied AI Scientist

Risk-adjusted RL evaluation and overfitting analysis are core applied-research work in fintech.

This challenge sharpens

rl-evaluation
backtesting
honest-reporting

Research Scientist

Multi-seed reporting and methodological transparency are the rigor signals industrial research-scientist roles look for.

This challenge sharpens

policy-gradients
rl-evaluation
honest-reporting

One more thing

You can put a credential on your CV by Friday.
