Reduce Dimensionality on Sensor Streams for a Mid-Cap Robotics OEM
Overview
What this challenge is about.
You receive 120 robot-hours of windowed sensor data (5 s windows, 240 channels), with each window labeled as normal or as one of four fault classes. Implement (1) PCA, (2) kernel PCA with an RBF kernel, and (3) a small autoencoder with a bottleneck layer. For each method, sweep the latent dimensionality (8, 16, 32, 64) and feed the embeddings into a fixed gradient-boosting classifier. Report downstream macro-F1, training time, and embedding stability across two random seeds. Recommend one method and latent dimensionality, and defend the trade-off in a 2-page memo.
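The sweep described above could be organized roughly as follows. This is a minimal sketch under stated assumptions, not a reference solution: it uses scikit-learn, it assumes each window has already been summarized into one 240-value feature vector (the brief does not say how windows are featurized), and it covers only PCA and RBF kernel PCA. The autoencoder is left out and could be added as another entry that exposes `fit_transform` and `transform`. The names `make_reducers`, `sweep`, `X_demo`, and `y_demo` are hypothetical, and the data is synthetic. Note also that if the 5 s windows do not overlap, 120 robot-hours correspond to 120 × 3600 / 5 = 86,400 windows, so exact kernel PCA would need an 86,400 × 86,400 kernel matrix (about 60 GB in float64); subsampling or an approximation may be needed in practice.

```python
import time

import numpy as np
from sklearn.decomposition import PCA, KernelPCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_reducers(dim, seed):
    """Hypothetical helper: one reducer per method for a given latent dim."""
    return {
        "pca": PCA(n_components=dim, random_state=seed),
        "kpca_rbf": KernelPCA(n_components=dim, kernel="rbf", random_state=seed),
    }


def sweep(X, y, dims=(8, 16, 32, 64), seeds=(0, 1)):
    """Fit each reducer, then score a fixed classifier on its embeddings."""
    # The split is held fixed so that only the reducer seed varies.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0
    )
    rows = []
    for dim in dims:
        for seed in seeds:
            for name, reducer in make_reducers(dim, seed).items():
                embed = make_pipeline(StandardScaler(), reducer)
                start = time.perf_counter()
                Z_tr = embed.fit_transform(X_tr)  # fit on training windows only
                fit_seconds = time.perf_counter() - start
                Z_te = embed.transform(X_te)

                # "Fixed" classifier: same hyperparameters and seed every run.
                clf = GradientBoostingClassifier(n_estimators=50, random_state=0)
                clf.fit(Z_tr, y_tr)
                macro_f1 = f1_score(y_te, clf.predict(Z_te), average="macro")
                rows.append({
                    "method": name, "dim": dim, "seed": seed,
                    "macro_f1": macro_f1, "fit_seconds": fit_seconds,
                })
    return rows


# Synthetic stand-in data: 300 windows, 240 features, 5 classes (normal + 4 faults).
rng = np.random.default_rng(0)
y_demo = np.arange(300) % 5
X_demo = rng.normal(size=(300, 240)) + 0.5 * y_demo[:, None]
rows = sweep(X_demo, y_demo, dims=(8, 16))
for row in rows:
    print(row)
```

Fitting the scaler and reducer on the training split only keeps the test windows out of the embedding, so the macro-F1 reflects what the downstream classifier would see on unseen data.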
The Brief
What you'll do, and what you'll demonstrate.
Pick the best dimensionality-reduction approach (and latent dimensionality) for a fault-classification pipeline on high-rate sensor streams.
Earning criteria — what you'll demonstrate
- Implement and compare linear, kernelized, and neural dimensionality reduction
- Evaluate embeddings via a downstream task, not just reconstruction error
- Measure embedding stability across seeds, since instability is a quiet failure mode
- Defend a choice across accuracy, cost, and stability
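The brief does not define how embedding stability should be measured, so the following is one possible choice rather than the required metric. It sketches linear centered kernel alignment (CKA) between two embeddings of the same windows, using NumPy only. The function name `linear_cka` and the synthetic arrays are assumptions for illustration. A rotation-invariant score is used because latent axes are not comparable one-to-one across runs: PCA components are defined only up to sign, and autoencoder units can permute or rotate between seeds, so comparing raw coordinates would overstate instability.

```python
import numpy as np


def linear_cka(Z_a, Z_b):
    """Linear CKA between two embeddings of the same samples (rows aligned).

    Returns a value in [0, 1]; 1 means the embeddings match up to an
    orthogonal transform and isotropic scaling.
    """
    Z_a = Z_a - Z_a.mean(axis=0)  # center each latent dimension
    Z_b = Z_b - Z_b.mean(axis=0)
    cross = np.linalg.norm(Z_a.T @ Z_b, "fro") ** 2
    norm_a = np.linalg.norm(Z_a.T @ Z_a, "fro")
    norm_b = np.linalg.norm(Z_b.T @ Z_b, "fro")
    return cross / (norm_a * norm_b)


# Synthetic check: a rotated, rescaled copy should score close to 1,
# while an unrelated embedding should score much lower.
rng = np.random.default_rng(0)
Z_seed0 = rng.normal(size=(200, 16))
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # random orthogonal matrix
Z_rotated = 3.0 * Z_seed0 @ Q
Z_unrelated = rng.normal(size=(200, 16))

same_score = linear_cka(Z_seed0, Z_rotated)
diff_score = linear_cka(Z_seed0, Z_unrelated)
print(f"rotated copy: {same_score:.3f}, unrelated: {diff_score:.3f}")
```

In the challenge setting, `Z_seed0` and its counterpart would be the embeddings of the same held-out windows produced by two runs that differ only in random seed.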
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Machine Learning Engineer
Implementing and comparing dimensionality-reduction methods with downstream-task evaluation is the kind of work MLEs ship for any sensor-heavy product.
This challenge sharpens
- dimensionality-reduction
- autoencoders
- feature-engineering
Applied AI Scientist
Choosing between classical kernel methods and neural alternatives based on real downstream cost is an applied AI scientist's daily reality in robotics.
This challenge sharpens
- kernel-methods
- autoencoders
- model-evaluation
Data Engineer
Reasoning about embedding stability and pipeline reproducibility on high-rate sensor streams bridges directly to data-engineering work on feature stores.
This challenge sharpens
- dimensionality-reduction
- feature-engineering
- pytorch