Research

Policy-Gradient Trading Agent on Historical Data

Free | Verified credential | 2 weeks | Advanced

Overview

What this challenge is about.

You receive 5 years of daily OHLCV (Open/High/Low/Close/Volume) data for 5 large-cap stocks. Build an episodic environment in which each episode is one calendar year and the agent's action each day is the vector of portfolio weights across the 5 stocks. Implement REINFORCE with a learned baseline. Train on years 1-3, validate on year 4, and test on year 5 (strictly walk-forward, no leakage). Report annualized return, Sharpe ratio, max drawdown, and turnover, alongside an equal-weight buy-and-hold baseline for comparison (a performance benchmark, distinct from the learned baseline inside REINFORCE). The success criterion is HONEST reporting (whether or not the agent beats buy-and-hold) plus a written discussion of overfitting risk.
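The four reported metrics and the buy-and-hold comparison can be sketched in plain Python as follows. This is a minimal illustration, not the challenge's reference implementation: the function names, the 252-day annualization factor, the zero risk-free rate, and the definition of turnover as half the L1 change in target weights (ignoring intraday drift) are all assumptions made here.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization factor for daily data


def annualized_return(daily_returns):
    """Geometric annualized return from simple daily returns."""
    growth = math.prod(1.0 + r for r in daily_returns)
    return growth ** (TRADING_DAYS / len(daily_returns)) - 1.0


def sharpe_ratio(daily_returns, risk_free_daily=0.0):
    """Annualized Sharpe ratio; the risk-free rate is assumed 0 unless given."""
    excess = [r - risk_free_daily for r in daily_returns]
    sd = stdev(excess)  # sample standard deviation, needs >= 2 observations
    return 0.0 if sd == 0 else mean(excess) / sd * math.sqrt(TRADING_DAYS)


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the equity curve, as a positive fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def average_turnover(weights):
    """Mean daily one-way turnover: half the L1 change in target weights."""
    changes = [
        0.5 * sum(abs(b - a) for a, b in zip(prev, cur))
        for prev, cur in zip(weights, weights[1:])
    ]
    return mean(changes) if changes else 0.0


def buy_and_hold_returns(asset_returns):
    """Daily returns of an equal-weight portfolio bought once, never rebalanced.

    asset_returns[t][i] is the simple return of asset i on day t.
    """
    n = len(asset_returns[0])
    values = [1.0 / n] * n  # equal initial capital in each asset
    out = []
    for day in asset_returns:
        before = sum(values)
        values = [v * (1.0 + r) for v, r in zip(values, day)]
        out.append(sum(values) / before - 1.0)
    return out
```

Running the same metric functions on both the agent's daily returns and the output of `buy_and_hold_returns` keeps the comparison like-for-like. Buy-and-hold has zero turnover after the initial purchase, so any turnover the agent incurs is a cost the report should discuss.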

Credential: Blockchain-anchored
Shareable: LinkedIn-ready
Language: English
Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Train a REINFORCE policy-gradient trading agent and report honest walk-forward performance against a buy-and-hold baseline.
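One way to make the walk-forward split explicit is to assign each trading day to a partition purely by calendar year, so that no validation or test day can end up in the training set. The sketch below is a hypothetical helper (the name `walk_forward_split` and its parameters are assumptions, not part of the challenge) and uses one placeholder date per year for illustration.

```python
from datetime import date


def walk_forward_split(dates, n_train=3, n_val=1, n_test=1):
    """Map each date's index to 'train', 'val', or 'test' by calendar year.

    The earliest n_train years go to train, the next n_val to val,
    and the final n_test to test.
    """
    years = sorted({d.year for d in dates})
    expected = n_train + n_val + n_test
    if len(years) != expected:
        raise ValueError(f"expected {expected} calendar years, got {len(years)}")

    role = {}
    for k, year in enumerate(years):
        if k < n_train:
            role[year] = "train"
        elif k < n_train + n_val:
            role[year] = "val"
        else:
            role[year] = "test"

    split = {"train": [], "val": [], "test": []}
    for i, d in enumerate(dates):
        split[role[d.year]].append(i)
    return split


# Illustrative usage with placeholder dates: one day per year, 2015-2019.
dates = [date(2015 + k, 6, 1) for k in range(5)]
split = walk_forward_split(dates)
```

Leakage can also enter through preprocessing: any normalization statistics, feature scalers, or hyperparameter choices should be derived from the training (and, for model selection, validation) years only, and then applied unchanged to the test year.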

Earning criteria — what you'll demonstrate

  • Derive and implement REINFORCE with a baseline in PyTorch
  • Design a leak-free walk-forward backtest
  • Evaluate RL policies with risk-adjusted metrics, not just returns
  • Practice honest reporting of negative or marginal RL results
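For the first criterion, the estimator to be derived can be written out as below. The notation is an assumption made for this sketch and does not come from the challenge text: $\pi_\theta$ is the policy over daily portfolio weights, $b_\phi$ the learned baseline, $r_t$ the daily reward, $T$ the number of trading days in an episode, and $\gamma$ a discount factor.

```latex
% Return-to-go from day t (assumed notation)
G_t = \sum_{k=t}^{T} \gamma^{\,k-t} r_k

% REINFORCE gradient with a state-dependent baseline
\nabla_\theta J(\theta)
  = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ \sum_{t=1}^{T}
      \nabla_\theta \log \pi_\theta(a_t \mid s_t)\,\bigl(G_t - b_\phi(s_t)\bigr) \right]

% The baseline adds no bias because, for any fixed state s,
\mathbb{E}_{a \sim \pi_\theta(\cdot \mid s)}\!\left[
    \nabla_\theta \log \pi_\theta(a \mid s)\, b_\phi(s) \right]
  = b_\phi(s)\, \nabla_\theta \int \pi_\theta(a \mid s)\, da
  = b_\phi(s)\, \nabla_\theta 1 = 0

% One common choice: fit the baseline by regression on observed returns
L(\phi) = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[
    \sum_{t=1}^{T} \bigl(G_t - b_\phi(s_t)\bigr)^2 \right]
```

In an autograd framework this typically becomes a surrogate loss of $-\sum_t \log \pi_\theta(a_t \mid s_t)\,(G_t - b_\phi(s_t))$ with the advantage term treated as a constant (detached from the graph), so that the policy gradient does not flow into the baseline parameters.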

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

ML Researcher

Implementing a clean REINFORCE study with honest walk-forward reporting is the kind of integrity-first research that quant and research teams hire for.

This challenge sharpens

  • policy-gradients
  • reinforce
  • honest-reporting

Applied AI Scientist

Risk-adjusted RL evaluation and overfitting analysis are core applied-research work in fintech.

This challenge sharpens

  • rl-evaluation
  • backtesting
  • honest-reporting

Research Scientist

Multi-seed reporting and methodological transparency are the rigor signals industrial research-scientist roles look for.

This challenge sharpens

  • policy-gradients
  • rl-evaluation
  • honest-reporting

One more thing

You can put a credential on your CV by Friday.

Policy-Gradient Trading Agent on Historical Data | Ewance Challenge