Use Actor-Critic to Auto-Tune an HVAC Control Policy

Free · Verified credential · 3 weeks · Advanced

Overview

What this challenge is about.

You receive a Sinergym wrapper around an EnergyPlus model of one building floor with 8 thermal zones, one year of weather data, and occupancy schedules. Train a Soft Actor-Critic agent (SAC, an off-policy actor-critic algorithm for continuous control) to set temperature setpoints, using a reward that combines energy use with a comfort penalty based on predicted-mean-vote (PMV) bounds. Evaluate over 4 held-out seasons and report kWh saved vs. the rule-based controller, comfort violation minutes per week, and policy stability across seeds. Write a safety memo that explains the failure modes and proposes guard rails for a pilot.
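
One way to picture the reward described above is a negative weighted sum of the energy used in a step and the amount by which each occupied zone's PMV falls outside the comfort band. The sketch below is a minimal illustration under stated assumptions: the names `RewardConfig`, `pmv_excess`, and `step_reward` are hypothetical, the ±0.5 PMV band and both weights are placeholder values, and nothing here is the Sinergym reward API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    # All values are illustrative assumptions, not taken from the challenge brief.
    pmv_low: float = -0.5         # lower comfort bound on predicted mean vote
    pmv_high: float = 0.5         # upper comfort bound on predicted mean vote
    energy_weight: float = 1.0    # penalty per kWh used in the step
    comfort_weight: float = 10.0  # penalty per unit of PMV excess, per zone


def pmv_excess(pmv: float, cfg: RewardConfig) -> float:
    """Distance of a zone's PMV outside the comfort band (0.0 when inside)."""
    if pmv < cfg.pmv_low:
        return cfg.pmv_low - pmv
    if pmv > cfg.pmv_high:
        return pmv - cfg.pmv_high
    return 0.0


def step_reward(energy_kwh, zone_pmv, zone_occupied, cfg=RewardConfig()):
    """Negative weighted sum of energy use and comfort excess in occupied zones."""
    comfort = sum(
        pmv_excess(pmv, cfg)
        for pmv, occupied in zip(zone_pmv, zone_occupied)
        if occupied  # empty zones do not contribute to the comfort penalty
    )
    return -(cfg.energy_weight * energy_kwh + cfg.comfort_weight * comfort)
```

Ignoring unoccupied zones is a design choice in this sketch, not a requirement of the brief; it lets the agent relax setpoints in empty zones without being penalized. The ratio of `comfort_weight` to `energy_weight` is the main knob to tune, since it decides how much energy the agent will spend to avoid a comfort violation.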

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Train a SAC HVAC policy that beats the rule-based controller on energy use while never violating occupant comfort bounds, and propose pilot guard rails.
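
To judge whether a policy beats the rule-based controller, it helps to fix how each reported number is computed before training starts. The sketch below is one plausible set of definitions, not the grading code: the function names are hypothetical, violations are assumed to be logged as one boolean flag per simulation timestep, and stability across seeds is summarized as a mean and population standard deviation.

```python
from statistics import mean, pstdev


def kwh_saved(baseline_kwh: float, policy_kwh: float) -> float:
    """Energy saved relative to the rule-based baseline (positive is better)."""
    return baseline_kwh - policy_kwh


def violation_minutes_per_week(violated_steps, step_minutes: float, n_days: float) -> float:
    """Average minutes per week in which the comfort bound was violated.

    violated_steps: iterable of booleans, one per simulation timestep (assumed format).
    step_minutes:   length of one timestep in minutes.
    n_days:         length of the evaluated period in days.
    """
    if n_days <= 0:
        raise ValueError("n_days must be positive")
    total_minutes = sum(1 for violated in violated_steps if violated) * step_minutes
    return total_minutes / (n_days / 7)


def seed_spread(values):
    """Mean and population standard deviation of one metric across training seeds."""
    return mean(values), pstdev(values)
```

For example, three violated 15-minute steps over a 14-day evaluation window give 45 violation minutes over two weeks, or 22.5 minutes per week. Reporting `seed_spread` for each season separately makes it easier to see whether instability is confined to one season.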

Earning criteria — what you'll demonstrate

Implement and tune Soft Actor-Critic for continuous control
Design a constrained reward balancing energy and comfort
Evaluate policies seasonally on held-out weather
Translate RL safety considerations into operational guard rails
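
As an illustration of the last criterion, an operational guard rail can sit between the learned policy and the building and override or limit what the policy proposes. The sketch below is a minimal example under assumptions: `guarded_setpoint` is a hypothetical helper, the 18 to 26 °C limits and the 1 °C per-step rate limit are placeholder values, and the fallback is assumed to come from the existing rule-based controller.

```python
def guarded_setpoint(
    proposed: float,
    previous: float,
    fallback: float,
    comfort_ok: bool,
    lo: float = 18.0,       # assumed hard lower setpoint limit, in deg C
    hi: float = 26.0,       # assumed hard upper setpoint limit, in deg C
    max_step: float = 1.0,  # assumed maximum change per control step, in deg C
) -> float:
    """Return the setpoint to apply after guard rails are enforced."""
    # If comfort is already out of bounds, hand control back to the baseline.
    if not comfort_ok:
        return fallback
    # Clamp the policy's proposal to the hard limits.
    clamped = min(max(proposed, lo), hi)
    # Limit how far the setpoint may move from the previously applied value.
    delta = max(-max_step, min(max_step, clamped - previous))
    return previous + delta
```

With these placeholder values, a proposal of 30 °C from a previous setpoint of 22 °C is clamped to 26 °C and then rate-limited to 23 °C. Logging every step where the guard changes the policy's output gives the safety memo a concrete count of how often the learned policy would have left the safe envelope.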

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Deep Reinforcement Learning

Master · AI/ML

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Machine Learning Engineer

Training a deep RL controller against a real simulator with strict safety bounds is the kind of system MLEs ship in industrial and building-controls companies.

This challenge sharpens

soft-actor-critic
continuous-control
simulation

AI Safety Researcher

Designing constrained rewards, failure-mode analyses, and operational guard rails for a learned controller is exactly the day-one work of AI safety researchers in applied settings.

This challenge sharpens

safety-constraints
actor-critic
reinforcement-learning

Applied AI Scientist

Translating a research-grade SAC training run into a pilot-ready memo with quantified guard rails is core applied-AI-scientist work.

This challenge sharpens

soft-actor-critic
safety-constraints
simulation

One more thing

You can put a credential on your CV by Friday.
