Data Science Challenges
Explore data science challenges on Ewance to build skills employers expect from analysts and ML engineers. Work through challenges on data cleaning, exploratory analysis, modeling, and storytelling with data.
Most Popular
- Code · Intermediate · New
Build a Feature Store Backbone for a Healthtech ML Team
You receive synthetic wearable telemetry (heart rate, accelerometer, sleep stages) for around 5,000 patients across 90 days, plus the existing scattered feature scripts from the…
- Feature Engineering
- Data Modeling
- Python or JavaScript
Data Engineering and Big Data Systems
- Code · Intermediate · New
Migrate a Legacy Warehouse to a Lakehouse for an Edtech AI Platform
You receive a Postgres dump of around 50 GB and the current dbt models that produce the student-attempts mart. Land the raw data in object storage (S3 or GCS) as Parquet partiti…
- Lakehouse Architecture
- Delta Lake
- Apache Spark
Data Engineering and Big Data Systems
- Code · Intermediate · New
Build a Vector-Search Backend for an Enterprise AI Knowledge Assistant
You receive a corpus of around 20,000 PDFs (mixed scanned and digital) totalling around 30 GB and a labeled retrieval set of 200 queries with human-judged ground-truth passages.…
- RAG Architectures
- Vector Database Basics
- Word Embeddings
Data Engineering and Big Data Systems
- Code · Intermediate · New
Build an Anomaly-Detection Pipeline for Pharma Cold-Chain Logistics
You receive 18 months of shipment telemetry (around 60,000 shipments, around 12 million sensor readings) plus a hand-labeled set of 1,200 incidents (mix of true excursions, sens…
- Anomaly Detection
- Feature Engineering
- Time Series Basics
Data Mining and Knowledge Discovery
Practice your coursework on real scenarios.
Every challenge is shaped from real-world context — not generic exercises. The work mirrors what your degree prepares you for.
Why Ewance
- Code · Intermediate · New
Detect Coordinated Fraud Rings via Link Analysis at a Neobank
You receive 90 days of account, login, and transaction data (around 1.2 million accounts, around 30 million events) plus a labeled set of 80 known fraud rings. Build a multi-rel…
- Graph Analysis
- Community Detection
- Link Analysis
Data Mining and Knowledge Discovery
- Code · Intermediate · New
Build a Hybrid Recommender for a Niche Consumer-AI Music App
You receive listening events (around 240 million plays) plus a content embedding per track (audio + curator tags). Build a collaborative filtering model (ALS or implicit-feedbac…
- Recommender Systems
- Collaborative Filtering
- Content-Based Filtering
Data Mining and Knowledge Discovery
- Design · Intermediate · New
Visualize Embedding Drift for a RAG Knowledge Assistant
You receive weekly snapshots over 12 weeks of around 50,000 document embeddings each (1024-dim). Design and build a visualization tool that: (a) projects each snapshot to 2D wit…
- Word Embeddings
- Dimensionality Reduction
- UMAP
Data Visualization
- Code · Intermediate · New
Plan Inventory Replenishment as an MDP for an E-Commerce AI Startup
You receive 18 months of daily demand for 50 representative SKUs at one warehouse plus lead-time and unit-cost data. For one SKU at a time, formulate an MDP with state = (on-han…
- MDP Modeling
- Value Iteration
- Dynamic Programming
Decision Making Under Uncertainty
- Browse challenges
Explore role
Marketing Analyst
Plan and measure campaigns that grow the business. Funnel analytics, attribution, segmentation, and the rigorous measurement that lets marketing defend its budget at the leadership table.
- Code · Intermediate · New
Run a Monte Carlo Tree Search Strategy for a Robotics Pick-and-Place Task
You receive a simulator of the pick-and-place task: a bin with 10 randomly-placed parts, an action space of which part to pick next, and a reward = parts picked per minute with …
- Monte Carlo Tree Search
- Planning
- Simulation
Decision Making Under Uncertainty
- Analysis · Intermediate · New
Optimize Stop-Loss Policies with Dynamic Programming at a Quant Fund
You receive five years of daily PnL series for 12 momentum strategies plus a small set of state features (rolling vol, drawdown, regime indicator). Calibrate a discrete Markov m…
- Dynamic Programming
- Backward Induction
- State Modeling
Decision Making Under Uncertainty
- Analysis · Intermediate · New
Frame an Energy-Storage Dispatch Decision as a Bayesian Decision Problem
You receive 2 years of hourly spot-price data, 2 years of wind generation data, and a manufacturer's battery degradation model. Frame dispatch as a Bayesian decision problem: mo…
- Bayesian Decision Theory
- Price Modeling
- Backtesting
Decision Making Under Uncertainty
- Analysis · Intermediate · New
Simulate Hospital Bed Allocation for a Healthtech Decision Support Pilot
You receive 12 months of anonymized admissions and discharges data plus ward layouts (medicine, surgery, ICU, geriatrics) and a small set of clinical transfer rules. Build a dis…
- Discrete Event Simulation
- SimPy
- Policy Comparison
Decision Support Systems and Decision Analysis
Build a verifiable portfolio.
Submissions become evidence. Reviewers with shipping experience score against a rubric; the result becomes a credential anyone can verify.
Why Ewance
- Design · Intermediate · New
Score Compliance Risk for an Enterprise AI Rollout Pipeline
You will design a compliance-risk scoring methodology covering 8 attributes (data residency, model provider, retention policy, PII handling, audit trail, encryption, third-party…
- Risk Scoring
- Compliance Modeling
- Decision Support Systems
Decision Support Systems and Decision Analysis
- Code · Intermediate · New
Train a VAE for Synthetic Tabular Data at a Healthtech Startup
You receive a synthetic-but-realistic clinical-trial table (around 50,000 patients, 35 columns, mixed continuous and categorical). Train a tabular VAE (or TVAE/CTGAN as alternat…
- VAE
- Tabular Generation
- Synthetic Data
Deep Generative Models
- Code · Intermediate · New
Build a GAN-Based Defect Generator for a Hardware Manufacturing Line
You receive around 60,000 good-unit images and around 380 defective-unit images across 4 defect classes. Train a class-conditional GAN (StyleGAN2-ADA or a smaller alternative fo…
- GANs
- Class-Conditional Generation
- Data Augmentation
Deep Generative Models
- Research · Intermediate · New
Prototype a Normalizing Flow for Anomaly Scoring in Climate Sensor Data
You receive 12 months of multivariate sensor traces (8 channels per sensor, hourly). Train a Normalizing Flow (Real NVP or a small Neural Spline Flow) on a clean training window…
- Normalizing Flows
- Density Estimation
- Anomaly Detection
Deep Generative Models
- Code · Intermediate · New
Design a Visual Search Backend for a Boutique Luxury Marketplace
You receive a catalog of 80,000 luxury items (image + sparse metadata) and a labeled query set of 300 user photos with hand-picked target items. Choose an embedding strategy (CL…
- Visual Search
- Word Embeddings
- CLIP
Deep Learning for Computer Vision
- Research · Intermediate · New
Tune a PPO Policy for an Energy-Storage Trading Bot
You receive 18 months of 15-minute Nordic spot-price data, a battery dynamics model (capacity, round-trip efficiency, degradation curve), and a rule-based baseline that earns ab…
- Policy Gradients
- PPO
- Reinforcement Learning
Deep Reinforcement Learning
- Code · Intermediate · New
Use Actor-Critic to Auto-Tune an HVAC Control Policy
You receive a Sinergym wrapper around the EnergyPlus model of one floor with 8 thermal zones, weather data for one year, and occupancy schedules. Train a Soft Actor-Critic (SAC,…
- Actor-Critic
- Soft Actor Critic
- Continuous Control
Deep Reinforcement Learning
- Analysis · Intermediate · New
Imitation Learning from Human Demos for a Drone Inspection
You receive 6 hours of expert pilot demonstrations (state-action pairs at 20 Hz) recorded in an AirSim wind-farm environment with 3 turbine designs, plus a held-out 4th turbine …
- Imitation Learning
- Behavioral Cloning
- DAgger
Deep Reinforcement Learning
- Research · Intermediate · New
Hardware-Aware NAS for a Wearable ECG Classifier
You receive a labeled subset of an arrhythmia ECG dataset (about 80,000 10-second windows, 4 classes), a microcontroller latency lookup table (op-level milliseconds) for a Corte…
- Neural Architecture Search
- Hardware-Aware Design
- Edge Inference
Edge ML and On-Device Machine Learning
- Code · Intermediate · New
Solve a Vehicle-Routing Problem with Tabu Search
You receive a week of anonymized daily VRPTW instances (around 800 orders per day, 120 vehicles, hard delivery windows). Implement tabu search with: a route-insertion constructi…
- Tabu Search
- Metaheuristics
- Vehicle Routing
Evolutionary Computation and Metaheuristic Search
- Design · Intermediate · New
Counterfactual Explanations for an Insurance Pricing Model
You receive a trained LightGBM regression model (premium in GBP), the feature schema (28 features, 14 mutable from the customer's side), and 500 sample quotes. Use DiCE (Diverse…
- Counterfactual Explanations
- DiCE ML
- Interpretability
Explainable and Interpretable AI
- Code · Intermediate · New
LoRA Fine-Tune a 7B LLM for Legal-Clause Extraction
You receive a curated extraction dataset (2,000 train, 500 val, 500 test contracts with span-level labels across 12 clause types) and a fine-tunable 7B base model (e.g., Llama-3…
- Fine Tuning
- Parameter-Efficient Tuning
Fine-Tuning Large Language Models
How it works
From brief to credential, in six steps.
Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.
Industry teams behind a decade of practitioner briefs
Hiring from this pool?
Sponsor a challenge and meet candidates through actual work.
Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables — not resumes.