Model Evaluation

If you want to practice model evaluation, every challenge here lets you apply it to a real industry brief.

Recommended Challenges


Code · Intermediate · New
Build a Neural Surrogate for Computational Fluid Dynamics in HVAC Design
Use a published CFD dataset (e.g., AirfRANS or a small in-house dataset if available) of around 1,000 steady-state airflow simulations on 2D building zones. Train a Fourier Neur…
- Neural Operators
- Surrogate Modeling
- Computational Fluid Dynamics
AI for Science and Engineering
Analysis · Intermediate · New
Build a Bayesian Credit-Scoring Model for an Emerging-Markets Fintech
You receive an anonymized snapshot of about 30,000 historical applications with features (income proxy, tenure on platform, prior loans, region) and the binary default outcome. …
- Bayesian Learning
- Credit Scoring
- Model Evaluation
Advanced Machine Learning
Code · Intermediate · New
Train a Multimodal Classifier for Medical Triage
Pick a fusion architecture (early fusion via cross-attention, late fusion via score combination, or a unified multimodal encoder like FLAVA/CoCa). Train on the 14,000 pairs with…
- Multimodal Fusion
- Cross Attention
- PyTorch or TensorFlow
Multimodal Machine Learning
Code · Intermediate · New
Predict Loan Default Risk for a Cross-Border Fintech
You receive 18 months of transactions (around 12M rows) and seller-firmographic data. Define a defensible proxy label for default (e.g., a 60-day chargeback-or-dispute spike com…
- Feature Engineering
- Model Selection
- Model Evaluation
Applied Machine Learning
Code · Intermediate · New
Forecasting Model for Online-Game Daily Active Users
Build forecasts at 14-day horizon per region using: (1) classical baseline — SARIMA or Prophet; (2) ML approach — gradient-boosted regressor on engineered features (day-of-week,…
- Supervised Learning
- Time Series Forecasting
- Python or JavaScript
Machine Learning (CS Elective)
Code · Intermediate · New
Diagnose Equipment Failures with a Bayesian Network
You receive 90 days of sensor logs (vibration, spindle temperature, coolant flow, ambient humidity), the maintenance log of 180 failure events labeled by root cause, and a short…
- Bayesian Networks
- Probabilistic Inference
- Parameter Learning
Probabilistic Graphical Models
Code · Intermediate · New
Triage Medical-Imaging Annotations with a Small Vision Model
Train a binary normal/abnormal classifier on the public CheXpert or NIH ChestX-ray14 dataset. Use temperature scaling to calibrate the output, then define abstention thresholds …
- CNN Classification
- Transfer Learning
- Calibration
Applied Machine Learning
Analysis · Intermediate · New
Run a Pre-Deployment Fairness + Drift Audit on a Hiring Model
You receive a trained classifier (joblib), the training data sample, and a held-out 'next-month' evaluation set. Compute group fairness metrics (false-positive-rate gap, true-po…
- Fairness Metrics
- Drift Detection
- Bias Mitigation
Machine Learning in Practice
Code · Intermediate · New
Build an Anomaly-Detection Pipeline for Pharma Cold-Chain Logistics
You receive 18 months of shipment telemetry (around 60,000 shipments, around 12 million sensor readings) plus a hand-labeled set of 1,200 incidents (mix of true excursions, sens…
- Anomaly Detection
- Feature Engineering
- Time Series Basics
Data Mining and Knowledge Discovery
Analysis · Intermediate · New
Transfer-Learning Backbone Bake-Off for Retail Product Tagging
You receive 80,000 retail product images tagged with multiple labels from a 250-tag taxonomy. Use each of the three pretrained backbones via two transfer strategies: (1) linear …
- Transfer Learning
- Fine Tuning
- Supervised Learning
Meta-Learning, Transfer Learning, and Multi-Task Learning
Code · Intermediate · New
Train a Differentially Private Classifier on Medical Records
Use Opacus (PyTorch DP-SGD library). Train a tabular classifier (small MLP + gradient-boosted features) with DP-SGD at the agreed epsilon/delta. Run an accuracy-vs-privacy front…
- Differential Privacy
- DP-SGD
- Opacus
Privacy-Preserving Machine Learning
Code · Intermediate · New
Build an Ensemble Strategy for Marketing-Mix Modelling
You receive 36 months of weekly marketing-spend and outcome data for 8 sample brands. Build a per-brand baseline gradient-boosting MMM model, then build two more base learners (…
- Ensemble Methods
- Stacking
- Time Series CV
Machine Learning
Analysis · Intermediate · New
Chest-X-Ray Deployment Audit Across Hospital Sites
You receive (1) a vendor-supplied multi-label chest-X-ray classifier, (2) the current single-site held-out evaluation set, (3) a 12,000-image multi-site evaluation set with 14-f…
- Medical Imaging
- Classification
- Model Evaluation
Machine Learning for Imaging and Medical Image Analysis
Analysis · Intermediate · New
Compare ML Compiler Stacks on a Vision Backbone
Take a frozen ResNet-50 (or similar) in ONNX. Compile and benchmark it via TensorRT on Jetson + GPU, ONNX Runtime on all three, OpenVINO on x86 CPU, and IREE on ARM if time allo…
- ML Compilers
- TensorRT
- ONNX Optimization
Machine Learning Systems
Analysis · Intermediate · New
Compare Stereo Depth Methods for a Drone Inspection Startup
You receive 500 calibrated stereo pairs from a turbine inspection plus sparse LiDAR ground truth on each pair. Implement (or wrap) three depth estimators: OpenCV Semi-Global Mat…
- Stereo Depth Estimation
- Multi-View Geometry
- Model Evaluation
3D Vision and Multi-View Geometry
Research · Intermediate · New
Reproduce a Vision-Model Paper Under a Reproducibility Standard
Pick a vision-model paper from CVPR or NeurIPS 2024-2025 with publicly available code and a manageable compute footprint (single-GPU under 24 hours). Reproduce the headline metr…
- Reproducibility
- Experimental Design
- Model Evaluation
AI Measurement and Evaluation
Analysis · Intermediate · New
Audit a Sepsis Early-Warning Model for Subgroup Performance
You receive a pre-trained vendor model, the training-data summary, and a held-out hospital-network evaluation set (about 18,000 ICU stays with sepsis labels). Compute AUROC + AU…
- Model Evaluation
- Fairness Metrics
- Model Calibration
Machine Learning for Healthcare and Biomedicine
Analysis · Intermediate · New
Structured Prediction for Insurance Claim Triage
You receive 18,000 historical claims with text, attachments-count, claim amount, customer tenure, and the ground-truth final routing bucket. Train a structured classifier (e.g.,…
- Structured Prediction
- Multi-Class Classification
- Model Evaluation
Advanced Machine Learning
Research · Intermediate · New
Multi-Task Learning for a Healthtech Triage Model
You receive 40,000 de-identified intake-form records with two labels: urgency tier (4 classes) and routed sub-specialty (12 classes). Train (1) two independent classi…
- Multi-Task Learning
- Transfer Learning
- Hugging Face Transformers
Meta-Learning, Transfer Learning, and Multi-Task Learning
Code · Intermediate · New
Lane-Change Intent Classifier from Dashcam Video
Use a public driving video dataset (e.g., Argoverse 2 sensor or BDD100K) and curate ~6,000 short clips labeled with the three-class intent. Train a temporal model (e.g., a small…
- Video Understanding
- Temporal Modeling
- Model Evaluation
Visual Intelligence and Visual Reasoning
Research · Intermediate · New
Disease-Progression Modelling for a Neurodegeneration Biotech
You receive a curated longitudinal Parkinson's cohort (about 1,200 patients, 4-12 visits each, MDS-UPDRS sub-scores, cognitive assessments, demographics). Fit (1) a linear mixed…
- Disease Progression Modeling
- Mixed Effects Models
- State Space Models
Machine Learning for Healthcare and Biomedicine
Code · Intermediate · New
Defect Detection on PCBs for a Hardware-AI Manufacturer
Use a publicly available PCB defect dataset (e.g., DeepPCB or HRIPCB). Fine-tune a small object detector (YOLOv8n or RT-DETR-small) on the 6 defect classes. Evaluate mean Aver…
- Object Detection
- Transfer Learning
- Model Evaluation
Computer Vision
Code · Intermediate · New
Forecast Energy Demand for a Nordic Renewable Utility
You receive 5 years of hourly residential-segment demand, hourly weather data (temperature, wind, irradiance), and a calendar of public holidays. Build a probabilistic forecaste…
- Time Series Forecasting
- Probabilistic Modeling
- Feature Engineering
Applied Machine Learning
Code · Intermediate · New
Build a Sequence Model for Sign-Language Word Recognition
You receive about 12,000 short (1-3s) webcam clips covering a 50-word vocabulary, with body+hand pose features pre-extracted (e.g., MediaPipe Holistic landmarks per frame). Buil…
- Sequence Models
- Hugging Face Transformers
- Pose Estimation
Machine Perception

How it works

From brief to credential, in six steps.

Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.

Industry teams behind a decade of practitioner briefs

Hiring from this pool?

Sponsor a challenge and meet candidates through actual work.

Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables — not resumes.

