Train a VAE for Synthetic Tabular Data at a Healthtech Startup

Free · Verified credential · 3 weeks · Advanced

Overview

What this challenge is about.

You receive a synthetic-but-realistic clinical-trial table (around 50,000 patients, 35 columns, mixed continuous and categorical). You train a tabular VAE (TVAE or CTGAN are acceptable alternatives). You evaluate utility via downstream-model fidelity (train on synthetic, test on real, and compare against a model trained and tested on real data) and privacy via a basic membership-inference attack. You then tune the privacy/utility trade-off and recommend a release setting. The deliverables are the trained model, the evaluation report, and a 3-page data-sharing decision memo.
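
The utility check described above (train on synthetic, test on real, compared with real-on-real) can be sketched as follows. This is a minimal illustration under stated assumptions, not the challenge's reference evaluation: the toy data generator `make_table`, the hand-rolled logistic regression, and the noisy "synthetic" table standing in for VAE output are all hypothetical.

```python
import numpy as np

def fit_logreg(X, y, lr=0.1, steps=500):
    """Fit a plain logistic regression by gradient descent (stand-in downstream model)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))      # predicted probabilities
        w -= lr * Xb.T @ (p - y) / len(y)      # gradient step on the log-loss
    return w

def accuracy(w, X, y):
    """Share of rows where the sign of the logit matches the label."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return float(np.mean((Xb @ w > 0) == (y == 1)))

def make_table(rng, n, noise=0.0):
    """Hypothetical toy table: 5 numeric features and a binary outcome."""
    X = rng.normal(size=(n, 5))
    logits = X @ np.array([1.5, -1.0, 0.5, 0.0, 0.0])
    y = (logits + rng.normal(scale=1.0, size=n) > 0).astype(float)
    X = X + rng.normal(scale=noise, size=X.shape)  # optional feature noise
    return X, y

rng = np.random.default_rng(0)
X_real_train, y_real_train = make_table(rng, 2000)
X_real_test, y_real_test = make_table(rng, 1000)
# Assumption: a noisier draw from the same process stands in for generator output.
X_syn, y_syn = make_table(rng, 2000, noise=0.5)

# Train-real-test-real (baseline) versus train-synthetic-test-real.
trtr = accuracy(fit_logreg(X_real_train, y_real_train), X_real_test, y_real_test)
tstr = accuracy(fit_logreg(X_syn, y_syn), X_real_test, y_real_test)
utility_ratio = tstr / trtr

print(f"real-on-real accuracy:      {trtr:.3f}")
print(f"synthetic-on-real accuracy: {tstr:.3f}")
print(f"utility ratio:              {utility_ratio:.3f}")
```

A utility ratio near 1 suggests the synthetic table preserves the signal the downstream model needs. In the challenge itself, the real test split must be held out from generator training so the comparison is not contaminated.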

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Train a VAE-based synthetic data generator that meets utility and privacy thresholds acceptable for academic data-sharing.

Earning criteria — what you'll demonstrate

Adapt VAE training to mixed-type tabular data
Evaluate synthetic data utility via downstream-model fidelity
Implement and interpret a membership-inference attack
Reason about the privacy/utility trade-off for real data-sharing decisions
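
One simple way to approach the membership-inference criterion is a distance-to-closest-record attack: the attacker guesses that a record was in the training set if it lies unusually close to some synthetic row. The sketch below is an illustration under assumptions, not the required attack. The Gaussian toy data, the "leaky" generator simulated as training rows plus small noise, and all function names are hypothetical.

```python
import numpy as np

def min_dist_to_synthetic(records, synthetic):
    """Euclidean distance from each record to its nearest synthetic row."""
    d2 = ((records[:, None, :] - synthetic[None, :, :]) ** 2).sum(axis=2)
    return np.sqrt(d2.min(axis=1))

def auc(scores_pos, scores_neg):
    """Probability that a random member outscores a random non-member."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

def attack_auc(members, non_members, synthetic):
    """Score = negative nearest-synthetic distance; closer means 'more likely a member'."""
    s_in = -min_dist_to_synthetic(members, synthetic)
    s_out = -min_dist_to_synthetic(non_members, synthetic)
    return auc(s_in, s_out)

rng = np.random.default_rng(0)
members = rng.normal(size=(300, 8))       # rows the generator was trained on
non_members = rng.normal(size=(300, 8))   # held-out rows from the same population

# Assumption: an overfit generator is simulated as training rows plus small noise.
leaky_syn = members + rng.normal(scale=0.1, size=members.shape)
# Assumption: a non-memorizing generator is simulated as fresh population draws.
private_syn = rng.normal(size=(300, 8))

leaky_auc = attack_auc(members, non_members, leaky_syn)
private_auc = attack_auc(members, non_members, private_syn)

print(f"attack AUC, leaky generator:   {leaky_auc:.3f}")
print(f"attack AUC, private generator: {private_auc:.3f}")
```

An attack AUC close to 0.5 means the attacker does no better than chance, while values well above 0.5 indicate that training records are distinguishable. Reporting this number next to the utility result for each candidate setting gives the release recommendation a concrete basis.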

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Deep Generative Models

Master · AI/ML

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career paths this builds toward

Canonical roles

Research Scientist
AI Research

Synthetic-data work with formal utility and privacy evaluation is a strong portfolio piece for any privacy-ML or generative research role.

This challenge sharpens

vae
synthetic-data
privacy-evaluation

ML Researcher

Tabular VAEs and their utility/privacy trade-offs are an active research area; this challenge produces a credible, publication-ready first artifact.

This challenge sharpens

vae
tabular-generation
utility-evaluation

AI Safety Researcher

Privacy evaluation and membership-inference attacks are core AI safety research methods.

This challenge sharpens

privacy-evaluation
synthetic-data
vae

One more thing

You can put a credential on your CV by Friday.
