Train a Small Diffusion Model for Synthetic Defect Generation
Overview
What this challenge is about.
You receive 2,000 labeled defect images and 18,000 clean weld images. Train a small class-conditional latent diffusion model on the defect images (Hugging Face diffusers is fine). Generate 4,000 synthetic defect samples, then train the same fixed downstream defect classifier under three regimes: (a) real only, (b) real + synthetic, and (c) synthetic only. Report downstream F1 for each regime on a held-out real test set, along with a qualitative 'realism' review by three reviewers. Recommend whether to ship the synthetic-augmentation pipeline.
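The three training regimes and the F1 report can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a prescribed implementation: the function names `build_regimes` and `macro_f1` are hypothetical, samples are represented as `(sample_id, label)` pairs, and F1 is macro-averaged over classes, which the brief does not specify. How the clean weld images enter each regime is left to the caller.

```python
def build_regimes(real_train, synthetic):
    """Return the three training sets compared in the brief.

    Both arguments are lists of (sample_id, label) pairs. The held-out
    real test set must not appear in either list.
    """
    return {
        "real_only": list(real_train),
        "real_plus_synthetic": list(real_train) + list(synthetic),
        "synthetic_only": list(synthetic),
    }


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true.

    Assumption: macro averaging. Swap in binary or weighted F1 if the
    challenge rubric calls for it.
    """
    scores = []
    for c in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Each regime's training set would be passed to the same classifier and training recipe, and `macro_f1` would be computed only on predictions for the held-out real test set, so the three scores stay comparable.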
The Brief
What you'll do, and what you'll demonstrate.
Decide whether a small diffusion-model-based synthetic-defect generator usefully augments real defect data for downstream classification.
Earning criteria — what you'll demonstrate
- Apply generative perception models (latent diffusion) to a real industrial niche
- Evaluate synthetic data via a downstream task, not just visual inspection
- Compare training regimes (real vs. real+synthetic vs. synthetic-only) honestly
- Recommend an integration path with explicit risk discussion
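One way to keep the regime comparison honest is to repeat each training run over several random seeds and compare the F1 gain against run-to-run noise. The sketch below assumes that setup; the names `summarize` and `gain_over_baseline` are hypothetical, the "noise" yardstick (the larger of the two standard deviations) is a simple heuristic rather than a formal significance test, and the numbers in the usage lines are placeholders, not results.

```python
from statistics import mean, stdev


def summarize(f1_by_regime):
    """Map each regime name to (mean, sample std) of its per-seed F1 scores."""
    return {
        name: (mean(scores), stdev(scores) if len(scores) > 1 else 0.0)
        for name, scores in f1_by_regime.items()
    }


def gain_over_baseline(summary, regime, baseline="real_only"):
    """Return (mean F1 gain over the baseline, noise yardstick).

    The yardstick is the larger of the two per-regime standard deviations;
    a gain smaller than it is hard to distinguish from seed variance.
    """
    gain = summary[regime][0] - summary[baseline][0]
    noise = max(summary[regime][1], summary[baseline][1])
    return gain, noise


# Placeholder per-seed scores for illustration only; replace with measured F1.
example_scores = {
    "real_only": [0.50, 0.70],
    "real_plus_synthetic": [0.80, 0.80],
}
example_gain, example_noise = gain_over_baseline(
    summarize(example_scores), "real_plus_synthetic"
)
```

Reporting the gain next to the noise estimate gives the ship-or-not recommendation an explicit basis for its risk discussion.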
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
ML Researcher
Training a generative model and rigorously evaluating it on a downstream task is the kind of end-to-end research story that ML researcher hiring loops reward.
This challenge sharpens
- generative-perception
- diffusion-models
- experiment-design
Computer Vision Engineer
Synthetic-data augmentation pipelines are increasingly common at industrial-AI companies, and shipping one end-to-end is a strong CV-engineer portfolio piece.
This challenge sharpens
- data-augmentation
- convolutional-neural-networks
- pytorch
Applied AI Scientist
Tying a generative method to a measurable downstream metric and recommending an integration path is the daily craft of an applied AI scientist.
This challenge sharpens
- diffusion-models
- data-augmentation
- experiment-design