Hyperparameter Search via CMA-ES for a Pharma QSAR Model
Overview
What this challenge is about.
You receive a labeled QSAR dataset (about 25,000 compounds; regression on a binding-affinity target), a fixed feature pipeline (Morgan fingerprints plus descriptors), and the team's gradient-boosting model. Tune 8 hyperparameters (learning rate, depth, leaf count, regularization, subsampling, feature fraction, min-child-weight, num-rounds) under a 200-evaluation budget. Compare CMA-ES (via pycma), random search, and Bayesian optimization (Optuna's TPE) on three metrics: best score, time to reach 95% of the best score, and stability of the search trajectory across seeds. Recommend a default optimizer and make the case for it in a 2-page memo.
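One way to give all three optimizers the same search space is to let each propose points in the unit cube and decode them into the 8 hyperparameters. The sketch below illustrates that idea only. The bounds, the log/linear scaling choices, and the names `SPACE` and `decode` are assumptions made for illustration; the brief names the hyperparameters but not their ranges.

```python
import math

# Hypothetical ranges: the brief lists the 8 hyperparameters but not their bounds.
# Each entry is (name, low, high, scale, is_integer).
SPACE = [
    ("learning_rate",    1e-3, 0.3,   "log",    False),
    ("max_depth",        3,    12,    "linear", True),
    ("num_leaves",       8,    256,   "log",    True),
    ("reg_lambda",       1e-3, 10.0,  "log",    False),
    ("subsample",        0.5,  1.0,   "linear", False),
    ("feature_fraction", 0.3,  1.0,   "linear", False),
    ("min_child_weight", 1e-2, 100.0, "log",    False),
    ("num_rounds",       50,   2000,  "log",    True),
]


def decode(u):
    """Map a point u in [0, 1]^8 to a dict of hyperparameter values."""
    if len(u) != len(SPACE):
        raise ValueError(f"expected {len(SPACE)} values, got {len(u)}")
    params = {}
    for x, (name, low, high, scale, is_int) in zip(u, SPACE):
        x = min(1.0, max(0.0, x))  # clip proposals that leave the unit cube
        if scale == "log":
            # Interpolate in log space so small values get equal resolution.
            value = math.exp(math.log(low) + x * (math.log(high) - math.log(low)))
        else:
            value = low + x * (high - low)
        if is_int:
            value = int(round(value))
        # Guard against floating-point overshoot at the boundaries.
        params[name] = min(high, max(low, value))
    return params


# The centre of the cube decodes to a mid-range configuration.
mid = decode([0.5] * len(SPACE))
```

With an encoding like this, a continuous optimizer such as CMA-ES only ever sees 8 real numbers in [0, 1], and the integer rounding and log scaling stay identical across optimizers, which helps keep the comparison fair.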
The Brief
What you'll do, and what you'll demonstrate.
Pick the default hyperparameter optimizer for a pharma QSAR AutoML stack by fairly benchmarking CMA-ES, random search, and Bayesian optimization.
Earning criteria — what you'll demonstrate
- Apply CMA-ES to a real ML hyperparameter optimization problem
- Fairly compare metaheuristic search against Bayesian-optimization and random-search baselines
- Quantify search stability across seeds
- Communicate algorithm trade-offs to engineering leadership
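The comparison and stability criteria above can be made concrete with a few small helpers. This is a minimal sketch, not a prescribed scoring method. It assumes a positive, higher-is-better score (for example R²), uses evaluation count as a proxy for time, and summarizes stability as the spread of the final best score across seeds. The function names and `toy_objective` are hypothetical stand-ins; the real objective would be the model's cross-validated score.

```python
import random
import statistics


def best_so_far(scores):
    """Running maximum of a score sequence (the search trajectory)."""
    best, out = float("-inf"), []
    for s in scores:
        best = max(best, s)
        out.append(best)
    return out


def evals_to_fraction_of_best(scores, fraction=0.95):
    """1-based evaluation index at which the run first reaches
    `fraction` of its own final best. Assumes positive scores."""
    traj = best_so_far(scores)
    if traj[-1] <= 0:
        raise ValueError("metric assumes a positive, higher-is-better score")
    target = fraction * traj[-1]
    for i, b in enumerate(traj, start=1):
        if b >= target:
            return i


def summarize(runs):
    """Aggregate one optimizer's runs (one score list per seed)."""
    finals = [max(r) for r in runs]
    hits = [evals_to_fraction_of_best(r) for r in runs]
    return {
        "mean_best": statistics.mean(finals),
        "std_best": statistics.stdev(finals) if len(finals) > 1 else 0.0,
        "mean_evals_to_95": statistics.mean(hits),
    }


def toy_objective(u):
    """Stand-in objective with its maximum of 1.0 at u = (0.5, ..., 0.5)."""
    return 1.0 - sum((x - 0.5) ** 2 for x in u) / len(u)


def random_search(objective, dim, budget, seed):
    """Random-search baseline: `budget` uniform samples from [0, 1]^dim."""
    rng = random.Random(seed)
    return [objective([rng.random() for _ in range(dim)]) for _ in range(budget)]


# Same budget and same seeds for every optimizer keeps the comparison fair.
runs = [random_search(toy_objective, dim=8, budget=200, seed=s) for s in range(5)]
summary = summarize(runs)
```

Feeding the per-seed score lists from CMA-ES and TPE through the same `summarize` call would give directly comparable rows for the memo. If the team's metric is an error such as RMSE, the scores would need to be negated or the threshold redefined before this 95% rule makes sense.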
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Applied AI Scientist
Benchmarking optimizers under fair budgets and recommending a default for a production AutoML stack is exactly the work applied AI scientists do at pharma-AI firms.
This challenge sharpens
- cma-es
- hyperparameter-optimization
- benchmarking
Data Scientist
Fair experimental design, variance reporting, and a stakeholder-ready memo transfer directly to data-science roles on any modeling team.
This challenge sharpens
- evaluation
- hyperparameter-optimization
- python
ML Researcher
Stability analysis of search algorithms across seeds is the kind of methodology question ML researchers answer for product teams.
This challenge sharpens
- cma-es
- metaheuristics
- evaluation