Audit a Climate-Tech Sensor Dataset for Production Readiness
Overview
What this challenge is about.
You receive 18 months of raw sensor readings from 1,200 sensors (about 800M rows), plus a sensor-metadata table (location, firmware version, deployment date). Profile the data for duplicates, time-zone errors, sensor drift (when a sensor's readings slowly diverge from those of its neighbors), and ingest gaps. Quantify how often each issue occurs and which customer reports it affects, then propose a 5-rule data-quality monitoring spec the data engineer can wire into the existing Airflow pipeline. Success is a written audit with prioritized fixes and a one-page monitoring spec the engineer accepts at sprint planning.
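As a starting point for the duplicate and ingest-gap checks, here is a minimal sketch using only the Python standard library. It assumes each reading can be reduced to a (sensor_id, timestamp) pair and that sensors report at a known interval; the function name, the 10-minute interval, and the `gap_factor` parameter are illustrative assumptions, not part of the challenge data. At 800M rows you would likely run the same logic in SQL or a dataframe engine rather than in plain Python.

```python
from collections import Counter
from datetime import datetime, timedelta


def profile_duplicates_and_gaps(rows, expected_interval, gap_factor=2.0):
    """Count exact duplicate readings and list ingest gaps per sensor.

    rows: iterable of (sensor_id, datetime) pairs (assumed shape).
    expected_interval: timedelta between consecutive readings (assumed known).
    gap_factor: a gap is flagged when it exceeds expected_interval * gap_factor.
    """
    counts = Counter(rows)
    # Every extra copy of an identical (sensor_id, timestamp) key is one duplicate.
    duplicate_count = sum(n - 1 for n in counts.values())

    # Group the de-duplicated timestamps by sensor.
    by_sensor = {}
    for sensor_id, ts in counts:
        by_sensor.setdefault(sensor_id, []).append(ts)

    threshold = expected_interval * gap_factor
    gaps = []
    for sensor_id, stamps in by_sensor.items():
        stamps.sort()
        for prev, nxt in zip(stamps, stamps[1:]):
            if nxt - prev > threshold:
                gaps.append((sensor_id, prev, nxt))
    return duplicate_count, gaps


# Tiny made-up sample: one duplicate row and one 50-minute gap for sensor "s1".
t0 = datetime(2024, 1, 1, 0, 0)
sample = [
    ("s1", t0),
    ("s1", t0),
    ("s1", t0 + timedelta(minutes=10)),
    ("s1", t0 + timedelta(minutes=60)),
    ("s2", t0),
    ("s2", t0 + timedelta(minutes=10)),
]
dup_count, gap_list = profile_duplicates_and_gaps(sample, timedelta(minutes=10))
```

Dividing the duplicate count and the total gap duration by the row count and the observation window gives the "how often" figures the audit asks for.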
The Brief
What you'll do, and what you'll demonstrate.
Audit 18 months of sensor data and propose a prioritized remediation and monitoring plan that catches silent quality issues before they reach customers.
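One possible way to express a monitoring spec is as a small list of declarative rules evaluated against metrics computed on each pipeline run. The sketch below is hypothetical: the rule names, metric names, thresholds, and severities are placeholders for illustration, and your own five rules should come out of what the audit actually finds. It does not use any Airflow API; the `evaluate` function is plain Python that a pipeline task could call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str         # short identifier for the rule
    metric: str       # key expected in the per-run metrics dict
    max_value: float  # breach when the observed value exceeds this
    severity: str     # e.g. "warn" or "page" (placeholder levels)


# Five illustrative rules; every threshold here is a placeholder assumption.
RULES = [
    Rule("duplicate_rate", "duplicate_fraction", 0.001, "warn"),
    Rule("ingest_gap", "max_gap_minutes", 30.0, "page"),
    Rule("timezone_offset", "whole_hour_offset_fraction", 0.0, "page"),
    Rule("sensor_drift", "max_abs_drift_per_day", 0.05, "warn"),
    Rule("metadata_coverage", "readings_without_metadata_fraction", 0.0, "warn"),
]


def evaluate(rules, metrics):
    """Return (rule name, observed value, severity) for every breached rule."""
    breaches = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None:
            # A metric that was never computed is itself a monitoring failure.
            breaches.append((rule.name, "missing metric", rule.severity))
        elif value > rule.max_value:
            breaches.append((rule.name, value, rule.severity))
    return breaches


# Made-up metrics for one run: only the duplicate rate exceeds its threshold.
run_metrics = {
    "duplicate_fraction": 0.004,
    "max_gap_minutes": 12.0,
    "whole_hour_offset_fraction": 0.0,
    "max_abs_drift_per_day": 0.01,
    "readings_without_metadata_fraction": 0.0,
}
breaches = evaluate(RULES, run_metrics)
```

Keeping the rules as data rather than code makes the one-page spec and the implementation easy to compare line by line.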
Earning criteria — what you'll demonstrate
- Profile a large, multi-source dataset for systematic quality issues
- Distinguish sensor drift from real environmental change
- Translate audit findings into actionable engineering work
- Design data-quality monitoring that catches issues before customers do
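To make the drift-versus-real-change distinction concrete: a real environmental change should move a sensor and its neighbors together, while drift shows up as a trend in the difference between the sensor and its neighbors. The sketch below, assuming time-aligned readings and a known neighbor set (both assumptions you would derive from the metadata table), fits a least-squares slope to that residual. The function names and the sample numbers are made up for illustration.

```python
from statistics import median


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def drift_per_step(sensor, neighbors):
    """Trend of (sensor minus neighbor median) per time step.

    sensor: list of readings; neighbors: list of reading lists,
    all aligned to the same time steps (assumed).
    """
    residuals = [s - median(vals) for s, vals in zip(sensor, zip(*neighbors))]
    return slope(list(range(len(sensor))), residuals)


# Made-up data: the whole area warms by 0.5 units per day (a real change).
days = range(10)
neighbors = [[20 + 0.5 * d + offset for d in days] for offset in (-0.2, 0.0, 0.2)]
healthy = [20 + 0.5 * d for d in days]             # follows the shared trend
drifting = [20 + 0.5 * d + 0.1 * d for d in days]  # extra 0.1 per day of drift

healthy_drift = drift_per_step(healthy, neighbors)
drifting_drift = drift_per_step(drifting, neighbors)
```

Both sensors show the same warming in their raw readings, but only the second has a residual slope near 0.1 per day. Using the neighbor median rather than the mean keeps one bad neighbor from masking or faking drift.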
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Data Engineer
Data audits, drift detection, and writing a monitoring spec are exactly the projects data engineers own when joining a climate or IoT data team.
This challenge sharpens
- data-quality-audit
- monitoring-design
- data-wrangling
MLOps Engineer
Data-quality monitoring is a core MLOps responsibility; this challenge mirrors the discipline of setting up checks that catch issues before they reach models.
This challenge sharpens
- data-profiling
- monitoring-design
- time-series-analysis
Data Scientist
Understanding sensor drift versus signal is foundational for any data scientist working with real-world IoT data.
This challenge sharpens
- time-series-analysis
- data-profiling
- data-quality-audit