Computer & Information Sciences

Data Science Challenges

Real data-science projects and challenges on Ewance — clean messy datasets, build and evaluate models, and turn raw data into decisions the way a working data scientist does. Solve them to build a portfolio of verified, recruiter-checkable proof you can do the work — not just describe it.

Recommended challenges

Code · Intermediate · New
Design a Visual Search Backend for a Boutique Luxury Marketplace
You receive a catalog of 80,000 luxury items (image + sparse metadata) and a labeled query set of 300 user photos with hand-picked target items. Choose an embedding strategy (CL…
- Visual Search
- Embeddings
- CLIP
Deep Learning for Computer Vision
Research · Intermediate · New
Build Saliency-Map Explanations for Dermatology Triage
You receive a trained CNN (ResNet-50 backbone, 7-class lesion classifier) and a 1,000-image held-out test set with dermatologist labels. Implement Integrated Gradients, GradCAM,…
- Saliency Maps
- Integrated Gradients
- Grad-CAM
Explainable and Interpretable AI
Code · Beginner · New
Build an Interactive Carbon-Emissions Explorer for a Climate Nonprofit
You receive 20 years of harmonized European industrial emissions data (around 800,000 rows: country, sector, sub-sector, year, emissions in tons CO2e). Design and build an inter…
- Interactive Visualization
- D3
- Observable
Data Visualization
Analysis · Beginner · New
Sales Performance Analysis for a 40-Person SaaS Scale-Up
You will receive a dataset containing 500+ sales opportunities with fields like deal value, stage, source, close date, and account size. Your challenge is to design a data mart …
- Data Warehousing
- ETL
- OLAP
Business Intelligence
Develop in-demand professional skills.
Each challenge names the skills it strengthens. Over time, your profile fills with the competences a hiring manager would actually look for.
Why Ewance
Code · Beginner · New
Calibrate a Demand Forecast with Bayesian Confidence Intervals
You receive 24 months of weekly demand for 600 SKUs plus the existing XGBoost point predictions. Fit a Bayesian conformal-prediction layer (or, alternatively, a Gaussian-Process…
- Bayesian Inference
- Uncertainty Quantification
- Conformal Prediction
Probabilistic Machine Learning
Code · Intermediate · New
Build an End-to-End ML Pipeline for Loan-Default Prediction
You receive 24 months of historical application + outcome data (about 380,000 rows). Build a pipeline using a workflow orchestrator (Prefect, Kedro, or a simple Makefile chain) …
- ML Pipelines
- Feature Engineering
- Pipeline Testing
Machine Learning in Practice
Code · Intermediate · New
Implement Model Predictive Control for a Delivery Robot
You receive a kinematic bicycle model of the robot, a published track layout, and 30 minutes of recorded waypoint trajectories. Implement a nonlinear MPC controller using acados…
- Model Predictive Control
- Optimal Control
- Robotics Simulation
Advanced Robotics
Research · Beginner · New
Evaluate Speech Synthesis Voices for an EdTech Storyteller App
You will generate 60 audio clips (20 per vendor) covering 4 story genres and 3 emotional tones. Recruit 15 native Spanish speakers via a remote panel (Prolific or local equivale…
- TTS Evaluation
- Listening Studies
- MOS Scoring
Speech Recognition and Spoken Language Processing
Explore role
Pricing Strategist
Set the price that captures value without leaving sales on the table. Demand modelling, willingness-to-pay research, and the disciplined experimentation that turns pricing into a competitive advantage.
Browse challenges
Research · Intermediate · New
Red-Team Evaluation of a Refusal Policy
You receive the lab's written refusal policy (version 2.3) and a starter set of 60 red-team prompts (10 per category). Extend the set to 240 prompts (40 per category) using docu…
- Red Teaming
- Refusal Policy
- Alignment Evaluation
Machine Learning from Human Preferences (RLHF and Alignment)
Code · Intermediate · New
RAG Faithfulness Evaluation for a Medical-Education Assistant
You receive 200 student-style questions, two RAG configurations (config A: vector-only + GPT-class generator; config B: hybrid + rerank + GPT-class generator), and the medical-t…
- RAG Evaluation
- Faithfulness
- LLM-as-a-Judge
Retrieval-Augmented Generation
Code · Intermediate · New
Containerized Model Inference on Kubernetes for a Fintech
You receive a pre-trained credit-risk model (a LightGBM model file) and a sample request payload. Containerize a FastAPI inference service, deploy to EKS or GKE (a single-zone c…
- Kubernetes
- Containerization
- Autoscaling
Cloud Computing for Data and ML
Research · Senior · New
SAT-Based Planner for Smart-Grid Demand Response
Encode the dispatch problem (which customers to curtail by how much, respecting per-customer contractual caps and grid-cell totals) as a SAT or MaxSAT instance. Solve 50 histori…
- SAT-Based Planning
- Constraint Encoding
- Benchmarking
Automated Planning
Get recognized by recruiters and employers.
Credentials are blockchain-anchored via LearnCoin — tamper-evident, portable, link-shareable on LinkedIn and beyond.
Code · Intermediate · New
Automate Retraining with a Drift-Triggered MLflow Pipeline
Stand up the pipeline end to end with the team's existing stack (MLflow tracking + model registry, Airflow orchestration). Wire Evidently to compute weekly drift; when drift cro…
- MLflow
- Airflow
- Data Drift Detection
ML Engineering and Production ML
Analysis · Beginner · New
Spectral-Analyze Wearable Sleep Data for a Healthtech Pilot
You receive 30 nights of wearable data for each of 25 volunteers, with polysomnography-derived ground-truth stages (Wake / NREM / REM). Engineer spectral features (delta, theta, alpha, …
- Spectral Analysis
- Feature Engineering
- Wavelet Analysis
Time Series Analysis and Forecasting
Code · Beginner · New
Build a Video-Question-Answering Demo on a Budget
Pick a model (Video-LLaVA, VideoChat2, or LLaVA-Video) and justify the choice against the A10G budget. Build a Streamlit demo: upload video, ask question, get answer with cited frame timestam…
- Video Language Models
- Multimodal Fusion
- Streamlit
Multimodal Machine Learning
Code · Foundational · New
Tune a Pick-and-Place Controller for a Cosmetics Co-Packer
You receive 4 hours of logged trajectories from the existing controller (joint positions, target poses, miss/success labels) and read/write access to the controller config (YAML…
- Motion Control
- Trajectory Tuning
- Robot Kinematics
Robotics
Code · Intermediate · New
Defect Detection on PCBs for a Hardware-AI Manufacturer
Use a publicly available PCB defect dataset (e.g., DeepPCB or HRIPCB). Fine-tune a small object detector (YOLOv8n or RT-DETR-small) on the 6 defect classes. Evaluate mean Aver…
- Object Detection
- Transfer Learning
- Model Evaluation
Computer Vision
Code · Senior · New
Offline RL for Robot-Arm Skill Reuse
You receive 5,000 logged trajectories (state, action, reward, next-state) across 12 tasks, with 9 tasks for training and 3 held out. Train an offline RL algorithm (CQL or IQL re…
- Offline RL
- Conservative Q-Learning
- Skill Reuse
Robot Learning
Research · Intermediate · New
Fine-Tune a Vision-Language Model for Image Captioning
Take BLIP-2 or LLaVA-1.6 as the base. Fine-tune (LoRA is fine) on a 4,000-image accessibility-curated dataset where each image has a useful caption written by a low-vision-exper…
- Vision Language Models
- LoRA Fine-Tuning
- PyTorch
Multimodal Machine Learning
Research · Intermediate · New
Policy-Gradient Trading Agent on Historical Data
You receive 5 years of daily OHLCV (Open/High/Low/Close/Volume) data for 5 large-cap stocks. Build an episodic environment where each episode is one calendar year and the agent'…
- Policy Gradients
- REINFORCE
- RL Evaluation
Reinforcement Learning
Code · Intermediate · New
Localize a Mobile Robot with Particle-Filter SLAM
You receive 4 ROS bags from real customer plants, each containing 2D LiDAR scans, wheel odometry, and ground-truth poses (from a motion-capture cell used only for evaluation). I…
- State Estimation
- Particle Filter
- SLAM
Advanced Robotics
Code · Intermediate · New
Extract Skills and Roles from Job Postings for a Recruiter Tool
You receive 30,000 anonymized job postings and a labelled 1,000-posting benchmark with (skill, role, seniority) spans. Fine-tune a small token classifier (e.g., DeBERTa-v3-base)…
- Information Extraction
- Token Classification
- ESCO Taxonomy
Linguistic Engineering and Language Technologies
Analysis · Beginner · New
Chunking Strategy Bake-Off for Financial Filings
You receive 40 anonymized 10-K filings and 100 labeled questions split into 50 narrative (e.g., 'what is the company's main risk factor?') and 50 numerical (e.g., 'what was oper…
- Document Chunking
- Semantic Chunking
- Layout-Aware Chunking
Retrieval-Augmented Generation
Research · Senior · New
Multi-Tenant Vector Isolation for a B2B Knowledge Assistant
Build a small proof-of-concept in your chosen vector store (Pinecone or Qdrant — pick one and justify) that supports 10 simulated tenants with 1,000 vectors each. Implement the …
- Multi-Tenant Isolation
- Vector Databases
- Threat Modeling
Vector Databases and Embeddings

How it works

From brief to credential, in six steps.

Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.

Related fields

Industry teams behind a decade of practitioner briefs

Hiring from this pool?

Sponsor a challenge and meet candidates through actual work.

Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables — not resumes.

Explore sponsorship
