AI & Data
Statistics & Data Science Methods Challenges
Statistics & Data Science Methods challenges put you inside the work of drawing trustworthy conclusions from data. You'll build statistics fundamentals, practice exploratory data analysis, hypothesis testing, confidence intervals, and linear regression, and design sound sampling methods.
From there you'll handle the harder edges (Bayesian methods, causal inference, A/B testing with statistical significance, Monte Carlo simulation, and uncertainty quantification), applying experimental design the way practicing data scientists do. Each challenge you solve earns a verified credential you can share with recruiters.
Recommended Challenges
- Data Analysis
- Experimental design
- Simulation
- Exploratory Data Analysis
- Statistical Analysis
- Uncertainty Quantification
- Logistic regression
- Cost Modeling
- Hypothesis Testing
- Monte Carlo Simulation
- A/B testing with statistical significance
- Linear Regression
- Time series basics
- Bayesian methods
- Causal inference
- Sampling Methods
- Research · Senior · New
Long-Context QA Evaluation Benchmark for Legal Memoranda
You receive 25 anonymized legal memoranda (50-90 pages each) and 100 QA pairs whose answers are deliberately spread across the documents (25 in pages 1-20, 25 in pages 20-40, 25…
- Long-Context QA
- Benchmark Design
- Model Evaluation
Question Answering and Conversational Systems
- Research · Senior · New
Train Cooperative Agents with Multi-Agent RL
Pick an open multi-agent environment (PettingZoo's MPE 'simple_spread', Overcooked-AI, or SMAC). Implement or wrap three methods: IPPO (independent PPO per agent), MAPPO (centra…
- Multi-Agent Reinforcement Learning
- PPO
- PyTorch or TensorFlow
Multi-Agent Systems
- Research · Beginner · New
Evaluate a Generative AI Image Tool with a Within-Subjects Study
You will write a study protocol, recruit 20 participants (a Discord callout is fine), counterbalance the two conditions, and run 45-minute sessions over Zoom. Collect three meas…
- Experimental Design
- User Study
- Within-Subjects Design
Human-Computer Interaction for AI Systems
- Research · Senior · New
SAT-Based Planner for Smart-Grid Demand Response
Encode the dispatch problem (which customers to curtail by how much, respecting per-customer contractual caps and grid-cell totals) as a SAT or MaxSAT instance. Solve 50 histori…
- SAT-Based Planning
- Constraint Encoding
- Benchmarking
Automated Planning
Practice your coursework on real scenarios.
Every challenge is shaped from real industry context — not generic exercises. The work mirrors what your degree prepares you for.
Why Ewance
- Research · Senior · New
Plan a Parameter-Efficient Fine-Tuning Strategy for a Big-Tech AI Lab
You will produce (1) a 6-page survey of four PEFT methods (LoRA, adapters, prefix tuning, IA3) with their strengths, weaknesses, and parameter footprints, (2) a one-page decisio…
- Parameter-Efficient Fine-Tuning
- Transfer Learning
- Fine-Tuning
Meta-Learning, Transfer Learning, and Multi-Task Learning
- Analysis · Beginner · New
Chunking Strategy Bake-Off for Financial Filings
You receive 40 anonymized 10-K filings and 100 labeled questions split into 50 narrative (e.g., 'what is the company's main risk factor?') and 50 numerical (e.g., 'what was oper…
- Document Chunking
- Semantic Chunking
- Layout-Aware Chunking
Retrieval-Augmented Generation
- Research · Intermediate · New
Run a Perceptual Study on Color Scales for a Climate Risk Map
Design a remote study (Prolific or similar, 60 participants screened for normal color vision via Ishihara plates online) with 3 task types: (1) value estimation, (2) anomaly det…
- Perceptual Study
- Color Scales
- Experimental Design
Information and Data Visualization
- Research · Senior · New
Open-Vocabulary Segmentation Benchmark for a Robotics R&D Lab
Use a curated 200-image household scene set (publicly available HM3D renderings or COCO + a handful of household prompts). Benchmark 3 open-vocabulary segmentation models: SAM +…
- Open-Vocabulary Segmentation
- Vision-Language Models
- Benchmarking
Computer Vision
Browse challenges
Explore role
Product Manager
Ship product that solves real user problems. Combine user research, prototyping, and stakeholder alignment to turn ambiguous briefs into measurable wins — the role at the centre of modern software teams.
- Research · Senior · New
Reproduce a Mechanistic Interpretability Result on a Small Transformer
Pick a published mechanistic-interpretability paper that operates on a small (under 1 billion parameter) open-source transformer (e.g., GPT-2 small, Pythia 70M). Set up the envi…
- Mechanistic Interpretability
- Transformer Internals
- PyTorch or TensorFlow
AI Safety and Alignment
- Code · Senior · New
Auto-Tune a Distributed Training Cluster's Throughput
Pick a representative fine-tune job (an open 7B model on a public instruction dataset is fine). Define the search space: NCCL_ALGO, NCCL_PROTO, num_workers, prefetch_factor, gra…
- Distributed Training
- Hyperparameter Tuning
- NCCL
Machine Learning Systems
- Research · Intermediate · New
Hands-on Lab: Reproduce a Recent SOTA Vision Paper
Pick one of three pre-approved 2025 papers (offered by the supervisor) with a known reference codebase you may consult but not copy. Re-implement the model and training loop in …
- PyTorch or TensorFlow
- Paper Reproduction
- Experimental Design
AI/ML Practicum and Hands-on Lab
- Analysis · Senior · New
Cost-Quality Prompt Optimization at Scale
You receive 2,000 labeled code snippets (human rater consensus score 1-5) and budget for at most 8,000 API calls across the optimization run. Run a factorial sweep of 3 prompt s…
- Prompt Optimization
- Cost-Quality Tradeoff
- Experimental Design
Prompt Engineering
Build a verifiable portfolio.
Submissions become evidence. Reviewers with shipping experience score against a rubric; the result becomes a credential anyone can verify.
- Research · Intermediate · New
Reproduce a Vision-Model Paper Under a Reproducibility Standard
Pick a vision-model paper from CVPR or NeurIPS 2024-2025 with publicly available code and a manageable compute footprint (single-GPU under 24 hours). Reproduce the headline metr…
- Reproducibility
- Experimental Design
- Model Evaluation
AI Measurement and Evaluation
- Code · Intermediate · New
Train a Deep Q-Network for Warehouse Robot Routing
You receive a Gymnasium-compatible warehouse simulator (50x50 grid, 8 dynamic obstacle pedestrians, 20 randomized pick locations) and a baseline A* planner script. Train a DQN a…
- Deep Q-Learning
- Reinforcement Learning
- PyTorch or TensorFlow
Deep Reinforcement Learning
- Analysis · Beginner · New
Run an A/B Test on Two System Prompts for a Sales Email Assistant
You will (1) design the A/B test (random assignment by rep_id, 50/50 split, 2-week duration), (2) instrument three primary metrics: reply rate (event-based), average tokens per …
- Prompt Evaluation
- A/B Testing
- Metric Design
LLM Application Development
- Research · Senior · New
Pre-Register and Run a Small Neural-Network Ablation Study
You will study how three architectural and regularization choices (depth: 2/4/8 hidden layers; activation: ReLU vs. GELU; weight decay: 0 / 1e-4 / 1e-3) affect a small MLP's tes…
- Neural Networks
- Regularization
- Experimental Design
Machine Learning
- Research · Intermediate · New
Sim-to-Real Domain Randomization for a Mobile Robot
You receive an Isaac Sim navigation environment, a baseline trained policy, a 50-episode real-bench test set (recorded sensor streams + ground truth) for offline policy evaluati…
- Domain Randomization
- Sim-to-Real
- Robot Navigation
Robot Learning
- Research · Senior · New
Graph Transformer Research Probe for a Drug-Target Predictor
You receive a public drug-target interaction dataset (around 50,000 drug-target pairs with labels and molecular graphs), a strong GIN baseline, and a starter GraphGPS implementa…
- Graph Transformers
- Neural Networks
- Message Passing
Machine Learning on Graphs
- Research · Senior · New
Benchmark Reward-from-Feedback Methods on a Tabletop Pick-Place
You will use a Franka Panda arm in PyBullet on a 4-object pick-and-place task. For each of the three feedback methods, train a reward model and a downstream policy until converg…
- Reinforcement Learning
- Reward Learning
- Preference Comparison
Human-Robot Interaction
- Research · Intermediate · New
Lab Project: Compare Three Architectures on Your Own Mini-Benchmark
Scope the problem yourself (suggested examples: sentiment classification on a niche domain, tabular anomaly detection, time-series forecasting on a public dataset). Define the t…
- Experimental Design
- A/B Testing with Statistical Significance
- PyTorch or TensorFlow
AI/ML Practicum and Hands-on Lab
- Research · Senior · New
Stress-Test Scalable Oversight on a Tool-Using Agent
Design a sandwich-oversight study: pick a task domain where non-expert oversight is plausible but not trivial (e.g., reviewing data-analysis steps, checking small bug fixes, eva…
- Scalable Oversight
- Alignment Research
- Experimental Design
AI Safety and Alignment
- Design · Beginner · New
A/B-Test a Recommender Improvement Without Breaking Trust
You receive offline-evaluation results for both the production and candidate models plus aggregate metrics from the last 12 weeks (recipe views, save rate, weekly active users, …
- Experimental Design
- A/B Testing
- Metric Design
Machine Learning in Practice
- Research · Intermediate · New
Train a NeRF for Real-Estate Virtual Tours
You receive a curated dataset of 3 apartments, each with around 120 input images and known camera poses (already SfM-processed). Train a NeRF variant (Instant-NGP or Nerfacto re…
- Neural Scene Representation
- NeRF
- PyTorch or TensorFlow
3D Vision and Multi-View Geometry
- Research · Senior · New
Trajectory Prediction Model for Urban Robotaxis
Use the Argoverse 2 motion-forecasting dataset (open access). Train an LSTM baseline + a transformer challenger (e.g., a small Wayformer or HiVT). Evaluate on minADE/minFDE (min…
- Trajectory Prediction
- Transformer Models
- Evaluation
AI for Autonomous Vehicles
How it works
From brief to credential, in six steps.
Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.
Industry teams behind a decade of practitioner briefs
Hiring from this pool?
Sponsor a challenge and meet candidates through actual work.
Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables — not resumes.