Code

Train a VAE for Synthetic Tabular Data at a Healthtech Startup

Free | Verified credential | 3 weeks | Advanced

Overview

What this challenge is about.

You receive a synthetic-but-realistic clinical-trial table (around 50,000 patients, 35 columns, a mix of continuous and categorical). Train a tabular VAE (TVAE or CTGAN are acceptable alternatives). Evaluate utility via downstream-model fidelity: train a model on synthetic data, test it on real data, and compare the result against the same model trained and tested on real data. Evaluate privacy via a basic membership-inference attack. Tune the privacy/utility trade-off and recommend a release setting. The deliverables are the trained model, the evaluation report, and a 3-page data-sharing decision memo.
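The utility check described above (train on synthetic, test on real, compare to real-on-real) can be sketched in a few lines. This is a minimal illustration only: the nearest-centroid classifier, the `make_rows` toy data, and the `fidelity_gap` name are assumptions made for the example, not part of the challenge materials. In practice you would likely swap in a scikit-learn model and the real clinical-trial table.

```python
import random
from statistics import mean

def fit_centroids(rows, labels):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    by_class = {}
    for x, y in zip(rows, labels):
        by_class.setdefault(y, []).append(x)
    return {y: [mean(col) for col in zip(*xs)] for y, xs in by_class.items()}

def predict(centroids, x):
    """Return the class whose centroid is closest (squared Euclidean)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )

def accuracy(centroids, rows, labels):
    hits = [1.0 if predict(centroids, x) == y else 0.0 for x, y in zip(rows, labels)]
    return mean(hits)

def make_rows(n, rng, noise):
    """Hypothetical stand-in data: two continuous features, binary outcome."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append([rng.gauss(2.0 * y, noise), rng.gauss(-1.0 * y, noise)])
        labels.append(y)
    return rows, labels

rng = random.Random(0)
real_train = make_rows(2000, rng, noise=1.0)
real_test = make_rows(1000, rng, noise=1.0)
# Placeholder for generator output: same structure, noisier features.
synthetic = make_rows(2000, rng, noise=1.5)

trtr = accuracy(fit_centroids(*real_train), *real_test)  # train real, test real
tstr = accuracy(fit_centroids(*synthetic), *real_test)   # train synthetic, test real
fidelity_gap = trtr - tstr
print(f"TRTR={trtr:.3f}  TSTR={tstr:.3f}  gap={fidelity_gap:.3f}")
```

A small gap suggests the synthetic table preserves the signal the downstream model needs; a large gap suggests the generator has lost it. Both scores use the same held-out real test set, so the comparison is like for like.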

Credential: Blockchain-anchored
Shareable: LinkedIn-ready
Language: English
Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Train a VAE-based synthetic data generator that meets utility and privacy thresholds acceptable for academic data-sharing.
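One common way to train a VAE on mixed-type rows is to sum a Gaussian reconstruction term for continuous columns, a cross-entropy term for categorical columns, and a KL penalty on the latent code. The sketch below computes that loss for a single row in plain Python so the arithmetic is visible. The function name, its arguments, and the `beta` weight are hypothetical, and the unit-variance Gaussian is an assumption; a real implementation would use a tensor library and batches.

```python
import math

def mixed_type_vae_loss(x_cont, x_cat, recon_mean, recon_logits, mu, logvar, beta=1.0):
    """Negative ELBO for one row, up to additive constants.

    x_cont:       standardized continuous values
    x_cat:        integer category index for each categorical column
    recon_mean:   decoder means for the continuous columns
    recon_logits: one list of logits per categorical column
    mu, logvar:   encoder outputs for the diagonal-Gaussian latent
    beta:         weight on the KL term (assumed knob, not prescribed)
    """
    # Continuous columns: unit-variance Gaussian NLL is half the squared error.
    cont = 0.5 * sum((x - m) ** 2 for x, m in zip(x_cont, recon_mean))

    # Categorical columns: cross-entropy via a numerically stable log-softmax.
    cat = 0.0
    for k, logits in zip(x_cat, recon_logits):
        top = max(logits)
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        cat += log_z - logits[k]

    # KL divergence between N(mu, exp(logvar)) and the standard normal prior.
    kl = 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))

    return cont + cat + beta * kl

# Example: one continuous column, one 3-level categorical column, 2-D latent.
loss = mixed_type_vae_loss(
    x_cont=[0.4], x_cat=[2],
    recon_mean=[0.1], recon_logits=[[0.0, 0.5, 2.0]],
    mu=[0.3, -0.2], logvar=[-0.1, 0.2],
)
print(f"loss={loss:.4f}")
```

Keeping the two reconstruction terms separate makes it easy to check that neither column type dominates the loss.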

Earning criteria — what you'll demonstrate

  • Adapt VAE training to mixed-type tabular data
  • Evaluate synthetic data utility via downstream-model fidelity
  • Implement and interpret a membership-inference attack
  • Reason about the privacy/utility trade-off for real data-sharing decisions
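A basic membership-inference attack can be illustrated with a distance-to-closest-record heuristic: if training rows sit unusually close to some synthetic row, an attacker can guess membership better than chance. The sketch below is one possible baseline, not the attack the challenge prescribes. The toy Gaussian data, the deliberately leaky generator, and all function names are assumptions made for the example.

```python
import random

def dcr(record, synthetic):
    """Squared Euclidean distance from a record to its closest synthetic row."""
    return min(sum((a - b) ** 2 for a, b in zip(record, s)) for s in synthetic)

def attack_auc(member_scores, nonmember_scores):
    """Probability that a random member outscores a random non-member (ties count half)."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            wins += 1.0 if m > n else 0.5 if m == n else 0.0
    return wins / (len(member_scores) * len(nonmember_scores))

rng = random.Random(0)

def sample(n, dims=5):
    """Hypothetical stand-in for standardized patient rows."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dims)] for _ in range(n)]

members = sample(200)   # rows the generator was trained on
holdout = sample(200)   # rows from the same population, never seen in training

# A deliberately leaky toy generator: training rows plus a little noise.
leaky_synth = [[v + rng.gauss(0.0, 0.05) for v in row] for row in members]
# A non-leaky reference: fresh draws that do not depend on the members.
safe_synth = sample(200)

def mia_auc(synth):
    # Attack score is negative distance: closer to synthetic means "more likely a member".
    member_scores = [-dcr(r, synth) for r in members]
    nonmember_scores = [-dcr(r, synth) for r in holdout]
    return attack_auc(member_scores, nonmember_scores)

leaky_auc = mia_auc(leaky_synth)
safe_auc = mia_auc(safe_synth)
print(f"leaky generator AUC={leaky_auc:.3f}  independent generator AUC={safe_auc:.3f}")
```

An AUC near 0.5 means the attacker does no better than guessing; values approaching 1.0 indicate the generator is memorizing training rows. Reporting this number alongside the utility gap at each candidate setting gives a concrete basis for the release recommendation.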

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Research Scientist

Synthetic-data work with formal utility and privacy evaluation is a strong portfolio piece for any privacy-ML or generative research role.

This challenge sharpens

  • vae
  • synthetic-data
  • privacy-evaluation

ML Researcher

Tabular VAEs and their utility/privacy trade-offs are an active research area; this challenge produces a credible first artifact that can serve as the basis for a publication.

This challenge sharpens

  • vae
  • tabular-generation
  • utility-evaluation

AI Safety Researcher

Privacy evaluation and membership-inference attacks are core AI safety research methods.

This challenge sharpens

  • privacy-evaluation
  • synthetic-data
  • vae

One more thing

You can put a credential on your CV by Friday.

Train a VAE for Synthetic Tabular Data at a Healthtech Startup | Ewance Challenge