Computer & Information Sciences
Data Science Challenges
Real data-science projects and challenges on Ewance — clean messy datasets, build and evaluate models, and turn raw data into decisions the way a working data scientist does. Solve them to build a portfolio of verified, recruiter-checkable proof you can do the work — not just describe it.
Recommended challenges
- Analysis · Beginner · New
Audit a Hiring-Screen Classifier for Fairness Across Cohorts
You receive the classifier as a black-box API and a synthetic-but-realistic dataset of 8,000 CVs with imputed demographic proxies (gender, age band, regional cluster) and labele…
- Fairness Evaluation
- Disparate Impact
- Audit Methodology
Trustworthy AI, Robustness, and Safety
- Code · Senior · New
Cost-Optimize a 24/7 LLM API Cluster
Profile the current usage (24-hour trace, per-team breakdown). Pick a cost-optimization mix from: time-based autoscaling, spot/preemptible instances with graceful drain, smarter…
- LLM Serving
- Autoscaling
- Ray
ML Engineering and Production ML
- Strategy · Beginner · New
Design an Internal AI-Use Policy for a Mid-Cap Bank
You receive the bank's existing IT-acceptable-use policy and a description of which AI tools are being rolled out (an internal Anthropic Claude wrapper for general use; a code-c…
- AI Governance Frameworks
- Policy Design
- Responsible AI
AI Ethics, Fairness, and Responsible AI
- Code · Intermediate · New
Fine-Tune ASR for a Healthcare Voice-Note Startup
You receive about 40 hours of de-identified clinician voice notes paired with corrected transcripts plus a medical-terminology lexicon (about 8,000 drug + procedure terms). Fine…
- ASR
- Speech Recognition
- Domain Adaptation
Speech Recognition and Spoken Language Processing
Practice your coursework on real scenarios.
Every challenge is shaped from real industry context — not generic exercises. The work mirrors what your degree prepares you for.
Why Ewance
- Code · Intermediate · New
Build a Cross-Lingual Retrieval-Augmented QA System
Index around 5,000 internal-knowledge docs across the three languages using a multilingual embedding model (e.g., multilingual-e5 or BGE-M3). Build the retrieval-then-answer pip…
- RAG Architectures
- Cross-Lingual Retrieval
- Multilingual Embeddings
Neural Networks for NLP
- Code · Intermediate · New
Few-Shot Defect Classifier for a Fast-Onboarding Industrial AI Vendor
You receive a multi-customer defect dataset (8 historical customers, 4-6 defect classes each). Treat 6 customers as the meta-training set and 2 as the held-out 'new customer' sc…
- Meta-Learning
- Few-Shot Learning
- Prototypical Networks
Meta-Learning, Transfer Learning, and Multi-Task Learning
- Code · Intermediate · New
RAG Faithfulness Evaluation for a Medical-Education Assistant
You receive 200 student-style questions, two RAG configurations (config A: vector-only + GPT-class generator; config B: hybrid + rerank + GPT-class generator), and the medical-t…
- RAG Evaluation
- Faithfulness
- LLM-as-Judge
Retrieval-Augmented Generation
- Research · Senior · New
Quantify Sim-to-Real Gap for a Warehouse Manipulation Policy
You receive a trained pick-and-place policy (PyTorch), the simulation env (Isaac Lab), and access to a real-arm rig (or recorded teleop episodes if hardware is unavailable). Def…
- Sim-to-Real
- Manipulation
- Experimental Design
Robot Perception and Autonomy
Browse challenges
Explore role
Product Manager
Ship product that solves real user problems. Combine user research, prototyping, and stakeholder alignment to turn ambiguous briefs into measurable wins — the role at the centre of modern software teams.
- Analysis · Senior · New
Brain-Tumor MRI Segmentation Bake-Off
You receive a curated public multi-modal MRI brain-tumor cohort (~600 patients, T1/T1c/T2/FLAIR with whole-tumor / tumor-core / enhancing-tumor masks). Train all three architect…
- Medical Imaging
- Segmentation
- Neural Networks
Machine Learning for Imaging and Medical Image Analysis
- Design · Senior · New
Dynamic Pricing Optimization for a Ride-Hailing Platform
You are a data scientist at CityRide. Using 6 months of historical trip data (pickup/dropoff, time, fare, surge multiplier), weather data, and local events calendar, you must bu…
- Reinforcement Learning
- Optimization
- Simulation
Data Science for Business
- Code · Intermediate · New
Reproducible Patient-Cohort Analysis for a Pharma AI Vendor
You receive a written cohort definition (type-2 diabetes patients on metformin for at least 90 days, aged 40-70) and a target output: 12-month HbA1c change distribution plus a K…
- Reproducible Analysis
- Cohort Analysis
- Survival Analysis
Applied Data Analysis and Practical Data Science
- Analysis · Beginner · New
Explain a Credit-Risk Model with SHAP for a Fintech
You receive a trained XGBoost credit-risk model (binary default prediction), the training feature schema (38 features), and a held-out 10,000-sample test set with labels. Comput…
- SHAP
- Interpretability
- Fairness Analysis
Explainable and Interpretable AI
Build a verifiable portfolio.
Submissions become evidence. Reviewers with shipping experience score against a rubric; the result becomes a credential anyone can verify.
Why Ewance
- Research · Intermediate · New
Reproduce a Vision-Model Paper Under a Reproducibility Standard
Pick a vision-model paper from CVPR or NeurIPS 2024-2025 with publicly available code and a manageable compute footprint (single-GPU under 24 hours). Reproduce the headline metr…
- Reproducibility
- Experimental Design
- Model Evaluation
AI Measurement and Evaluation
- Design · Intermediate · New
Score Compliance Risk for an Enterprise AI Rollout Pipeline
You will design a compliance-risk scoring methodology covering 8 attributes (data residency, model provider, retention policy, PII handling, audit trail, encryption, third-party…
- Risk Scoring
- Compliance Modeling
- Decision Support Systems
Decision Support Systems and Decision Analysis
- Research · Intermediate · New
Disease-Progression Modelling for a Neurodegeneration Biotech
You receive a curated longitudinal Parkinson's cohort (about 1,200 patients, 4-12 visits each, MDS-UPDRS sub-scores, cognitive assessments, demographics). Fit (1) a linear mixed…
- Disease Progression Modeling
- Mixed Effects Models
- State Space Models
Machine Learning for Healthcare and Biomedicine
- Analysis · Intermediate · New
Benchmark Approximate Nearest-Neighbor Indexes for a Code-Search Startup
You receive a 5M-vector sample (768-dim, float32) and a 1,000-query labeled benchmark with ground-truth top-50 neighbors per query. Index the same sample in Chroma (HNSW), Qdra…
- ANN Indexes
- HNSW
- Benchmarking
Vector Databases and Embeddings
- Code · Intermediate · New
Reconstruct a Heritage Facade with Structure-from-Motion
You receive 250 phone photos of the facade plus 6 ground control points measured by a surveyor (used only for metric scaling and validation, not for reconstruction). Run SfM to …
- Structure from Motion
- Multi-View Stereo
- 3D Reconstruction
3D Vision and Multi-View Geometry
- Analysis · Beginner · New
Diagnose Query Failures in an E-Commerce Search Box
You receive 6 months of anonymized query logs (~480 million rows): query string, language hint, results-shown count, top-3 product clicks, and add-to-cart events. Build a notebo…
- Query Log Analysis
- Clustering
- IR Failure Analysis
Information Retrieval and Search
- Analysis · Beginner · New
Map Creator Communities for a Short-Form Video Platform
You receive a 90-day sample of about 4 million creator-creator interactions (duets, mentions, audience overlap) and creator metadata (region, language, content tag). Build a cre…
- Network Analysis
- Community Detection
- Graph Visualization
Social Network Analysis and Web Science
- Research · Senior · New
Probabilistic Numerics for an ODE-Constrained Battery Model
You receive 12 months of charge/discharge cycle data for 50 battery packs from a delivery-van fleet, plus the existing single-particle ODE degradation model (Python). Use a prob…
- Probabilistic Numerics
- Bayesian Inference
- ODE Modeling
Probabilistic Machine Learning
- Research · Senior · New
Diffusion-Policy Imitation for Bimanual Cooking Tasks
You receive 300 teleoperated demonstrations of a bimanual pour-and-stir task in a Robomimic-style simulator, deliberately including 2 valid solution modes per task (left-pour-ri…
- Diffusion Policies
- Imitation Learning
- Multimodal Action Distributions
Robot Learning
- Code · Foundational · New
Build a Simple Neural Network to Read Handwritten Postal Codes
You receive a labeled dataset of about 60,000 handwritten digit images (28x28 grayscale) drawn from Indian postal forms. Build two models from scratch in PyTorch: (1) a 2-layer …
- Neural Networks
- PyTorch or TensorFlow
Machine Learning (Undergraduate)
- Code · Beginner · New
Ship a Lightweight ML Microservice for an EdTech Reading App
You receive 3 months of session telemetry (around 50M reading events, child-anonymized). Engineer features per session window, train a small classifier (logistic regression base…
- Feature Engineering
- Model Serving
- Containerization
Applied Machine Learning
- Research · Intermediate · New
Red-Team Evaluation of a Refusal Policy
You receive the lab's written refusal policy (version 2.3) and a starter set of 60 red-team prompts (10 per category). Extend the set to 240 prompts (40 per category) using docu…
- Red Team Operations
- Refusal Policy
- Alignment Evaluation
Machine Learning from Human Preferences (RLHF and Alignment)
How it works
From brief to credential, in six steps.
Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.
Industry teams behind a decade of practitioner briefs
Hiring from this pool?
Sponsor a challenge and meet candidates through actual work.
Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables — not resumes.