Computer & Information Sciences

Data Science Challenges

Real data-science projects and challenges on Ewance — clean messy datasets, build and evaluate models, and turn raw data into decisions the way a working data scientist does. Solve them to build a portfolio of verified, recruiter-checkable proof you can do the work — not just describe it.

Recommended challenges

Research · Beginner · New
Hyperparameter Search via CMA-ES for a Pharma QSAR Model
You receive a labeled QSAR dataset (around 25,000 compounds, regression on a binding-affinity target), a fixed feature pipeline (Morgan fingerprints + descriptors), and the team…
- CMA-ES
- Metaheuristics
- Hyperparameter Optimization
Evolutionary Computation and Metaheuristic Search
Code · Intermediate · New
Migrate a Legacy Warehouse to a Lakehouse for an Edtech AI Platform
You receive a Postgres dump of around 50 GB and the current dbt models that produce the student-attempts mart. Land the raw data in object storage (S3 or GCS) as Parquet partiti…
- Lakehouse Architecture
- Delta Lake
- Spark
Data Engineering and Big Data Systems
Design · Senior · New
Design a Distributed Training Job for a 13B-Parameter Model
Decide whether to use Fully Sharded Data Parallel (FSDP), Tensor Parallelism, Pipeline Parallelism, or a hybrid; justify against the 13B-param + 32-H100 setup. Calculate memory …
- Distributed Training
- FSDP
- PyTorch
Machine Learning Systems
Research · Intermediate · New
Build a Generalization-Bound Tutorial for an MLE Onboarding Track
You will produce a Jupyter-notebook tutorial covering (1) sample-complexity intuition, (2) VC-dimension with worked examples for halfspaces and decision stumps, (3) Rademacher c…
- Statistical Learning Theory
- VC Dimension
- Rademacher Complexity
Statistical Machine Learning
Practice your coursework on real scenarios.
Every challenge is shaped from real-world context — not generic exercises. The work mirrors what your degree prepares you for.
Why Ewance
Code · Beginner · New
Semantic Segmentation for a Solar-Panel Inspection Drone
Use a publicly available solar-panel dataset (or the PV-Defect-Detection dataset). Fine-tune a small U-Net or SegFormer-tiny on panel/no-panel pixel-level segmentation. Evaluate…
- Semantic Segmentation
- CNN
- Transfer Learning
Computer Vision (Undergraduate)
Research · Intermediate · New
Disease-Progression Modelling for a Neurodegeneration Biotech
You receive a curated longitudinal Parkinson's cohort (about 1,200 patients, 4-12 visits each, MDS-UPDRS sub-scores, cognitive assessments, demographics). Fit (1) a linear mixed…
- Disease Progression Modeling
- Mixed Effects Models
- State Space Models
Machine Learning for Healthcare and Biomedicine
Analysis · Beginner · New
Build a Reproducible Pricing Analysis for a DTC Skincare Brand
You receive 24 months of order-line data (around 480,000 lines), a Shopify-style customer export, and a discount-code log. Build a Python pipeline that produces: SKU-level price…
- Data Wrangling
- Exploratory Data Analysis
- Cohort Analysis
Applied Data Analysis and Practical Data Science
Design · Beginner · New
A/B-Test a Recommender Improvement Without Breaking Trust
You receive offline-evaluation results for both the production and candidate models plus aggregate metrics from the last 12 weeks (recipe views, save rate, weekly active users, …
- Experiment Design
- A/B Testing
- Metric Design
Machine Learning in Practice
Explore role
Product Manager
Ship product that solves real user problems. Combine user research, prototyping, and stakeholder alignment to turn ambiguous briefs into measurable wins — the role at the centre of modern software teams.
Browse challenges
Code · Intermediate · New
Multi-Turn Dialogue Manager for a Banking Assistant
You receive a transcript dataset of 200 conversations (human-tagged with intent, slot values, and required outcome), a list of 8 supported intents, and tool stubs for 3 backend …
- Dialogue Management
- Intent Classification
- Slot Filling
Question Answering and Conversational Systems
Code · Intermediate · New
Extractive QA on Clinical Trial Protocols
You receive 500 anonymized protocol PDFs (already OCR-ed to text) and 1,200 labeled question-answer pairs where each answer is an exact text span. Build an extractive QA system:…
- Extractive QA
- Reading Comprehension
- Model Finetuning
Question Answering and Conversational Systems
Analysis · Beginner · New
Run an A/B Test on Two System Prompts for a Sales Email Assistant
You will (1) design the A/B test (random assignment by rep_id, 50/50 split, 2-week duration), (2) instrument three primary metrics: reply rate (event-based), average tokens per …
- Prompt Evaluation
- A/B Testing
- Metric Design
LLM Application Development
Analysis · Foundational · New
Sentiment Analysis for a Tel Aviv D2C Cosmetics Brand
You are provided with a dataset of 10,000 customer reviews (in English) with no labels. Your task is to preprocess the text, develop a sentiment classification model using NLP t…
- Text Preprocessing
- Sentiment Analysis
- Classification
Text Analytics and Natural Language Processing
Build a verifiable portfolio.
Submissions become evidence. Reviewers with shipping experience score against a rubric; the result becomes a credential anyone can verify.
Analysis · Beginner · New
Community Detection on a Pharma Clinical-Trial Investigator Graph
You receive a pre-fetched dump of around 15,000 trials from a public registry covering oncology over the last 10 years and a mapping of trials to investigator names + institutio…
- Community Detection
- Louvain
- Leiden
Machine Learning on Graphs
Code · Intermediate · New
Reconstruct a Heritage Facade with Structure-from-Motion
You receive 250 phone photos of the facade plus 6 ground control points measured by a surveyor (used only for metric scaling and validation, not for reconstruction). Run SfM to …
- Structure from Motion
- Multi-View Stereo
- 3D Reconstruction
3D Vision and Multi-View Geometry
Design · Beginner · New
Optimizing Inventory for a Toronto D2C Cosmetics Brand
Your task is to design a multidimensional data model (star schema) for inventory management, create an ETL pipeline to load sample data (provided as CSV files), and develop an O…
- Data Warehousing
- ETL
- OLAP
Business Intelligence
Analysis · Beginner · New
Churn Prediction for a Stockholm D2C Cosmetics Brand
You are a data science consultant hired by NordicGlow. Using the provided dataset (synthetic but realistic), you must preprocess the data, engineer features from transaction, cl…
- Data Preprocessing
- Feature Engineering
- Classification
Data Science for Business
Code · Beginner · New
Build a Real-Time Operations Wall Display for a Logistics AI Startup
You receive a websocket feed of operational events (around 20 events per second) plus a small KPI definition list (throughput per zone, late-truck count, exception queue depth, …
- Real-Time Visualization
- Glanceability
- D3
Data Visualization
Research · Intermediate · New
Train a Physics-Informed Neural Network for Heat Transfer in a Battery Pack
Solve the 2D unsteady heat-conduction equation on a square cell cross-section with a localized source and Dirichlet boundary conditions on the casing. Implement a baseline finit…
- Physics-Informed Neural Networks
- Partial Differential Equations
- PyTorch
AI for Science and Engineering
Code · Intermediate · New
DPO Fine-Tune for a Domain-Specific Writing Assistant
You receive a base instruction-tuned model checkpoint plus 2,500 preference pairs from editorial reviews (each pair: two grant-application paragraphs, the editor-preferred winne…
- DPO
- Preference Learning
- Model Finetuning
Machine Learning from Human Preferences (RLHF and Alignment)
Research · Intermediate · New
Compare Kernel Methods to Trees on a Genomics Classification Task
You receive a curated benchmark of about 12,000 labeled variants with ~120 numerical + ~40 string features. Fit kernel SVMs (RBF, polynomial, string), random forest, and XGBoost…
- Kernel Methods
- SVM
- Tree Ensembles
Statistical Machine Learning
Code · Beginner · New
Ship a Lightweight ML Microservice for an EdTech Reading App
You receive 3 months of session telemetry (around 50M reading events, child-anonymized). Engineer features per session window, train a small classifier (logistic regression base…
- Feature Engineering
- Model Serving
- Containerization
Applied Machine Learning
Code · Intermediate · New
Build an Anomaly-Detection Pipeline for Pharma Cold-Chain Logistics
You receive 18 months of shipment telemetry (around 60,000 shipments, around 12 million sensor readings) plus a hand-labeled set of 1,200 incidents (mix of true excursions, sens…
- Anomaly Detection
- Feature Engineering
- Time Series
Data Mining and Knowledge Discovery
Code · Intermediate · New
Quantize a CNN for Battery-Powered Wildlife Cameras at a Climate Nonprofit
You receive an FP32 CNN (MobileNetV2 fine-tuned to 22 species, around 13 MB) and a hold-out test set of 4,000 images. Quantize to int8 (post-training quantization first, then qu…
- Quantization
- QAT
- Edge Deployment
Deep Learning
Code · Intermediate · New
Build a Sequence Model for Sign-Language Word Recognition
You receive about 12,000 short (1-3s) webcam clips covering a 50-word vocabulary, with body+hand pose features pre-extracted (e.g., MediaPipe Holistic landmarks per frame). Buil…
- Sequence Models
- Transformer
- Pose Estimation
Machine Perception

How it works

From brief to credential, in six steps.

Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.

Industry teams behind a decade of practitioner briefs

Hiring from this pool?

Sponsor a challenge and meet candidates through actual work.

Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables — not resumes.

Explore sponsorship