Model Evaluation
If you like applying Model Evaluation, every challenge here gives you a chance to practice it on a real industry brief.
- Code · Intermediate · New
Image-Classification Model for a Quality-Control Line at a Bottling Plant
Train an image classifier on 8,000 labeled bottle images (3 defect classes + 'ok'). Use transfer learning from a pre-trained backbone (EfficientNet-B0 or MobileNetV3) — the line…
- Deep Learning
- Supervised Learning
- ML Applications
Machine Learning (CS Elective)
- Analysis · Intermediate · New
Stress-Test a Hiring-Funnel Model for Bias
You receive a synthetic-but-realistic dataset of 25,000 past applicants with features (years of experience, education tier, prior role tags) and outcome labels (advanced past th…
- Model Evaluation
- Fairness Metrics
- Logistic Regression
Machine Learning (Undergraduate)
- Code · Intermediate · New
Reduce Dimensionality on Sensor Streams for a Mid-Cap Robotics OEM
You receive 120 robot-hours of windowed sensor data (5s windows, 240 channels) with labels for normal vs. one of four fault classes. Implement (1) PCA, (2) kernel PCA with an RB…
- Dimensionality Reduction
- Kernel Methods
- Autoencoders
Machine Learning
- Code · Intermediate · New
Ship a Lightweight ML Microservice for an EdTech Reading App
You receive 3 months of session telemetry (around 50M reading events, child-anonymized). Engineer features per session window, train a small classifier (logistic regression base…
- Feature Engineering
- Model Serving
- Containerization
Applied Machine Learning
Practice your coursework on real scenarios.
Every challenge is shaped from real industry context — not generic exercises. The work mirrors what your degree prepares you for.
Why Ewance
- Code · Intermediate · New
Build a Robust Image Classifier for a Climate-Tech Satellite Startup
You receive a labeled dataset of about 25,000 Sentinel-2 patches (positive = illegal construction visible, negative = not). The dataset is split by region AND by season so you c…
- Data Augmentation
- Deep Learning
- PyTorch
Advanced Deep Learning
- Code · Intermediate · New
Build a Credit-Card Fraud Detector for a Singapore Neobank
You receive 9 months of anonymized authorization data (around 8 million transactions, around 0.4 percent fraud) plus current rule outcomes. Split temporally and train at least t…
- Classification Modeling
- Class Imbalance
- Model Calibration
AI and Quantitative Finance
- Analysis · Intermediate · New
Cost-Model a Foundation-Model API Migration
You receive: 90 days of API logs (request volume, token distributions), the customer's golden eval set of 200 prompts, the incumbent and new pricing schedules, and quality ratin…
- Cost Modeling
- AI Strategy
- Model Evaluation
AI for Business and AI Product Management
- Analysis · Intermediate · New
Detect Fraudulent Refund Requests for a Mid-Market Marketplace
You receive a labeled dataset with buyer history, seller history, shipping carrier, refund reason text, and outcome label (legit / fraud). Train and evaluate at least two classi…
- Classification
- Model Calibration
- Imbalanced Classification
Machine Learning (Undergraduate)
Browse challenges
Explore role
Product Manager
Ship product that solves real user problems. Combine user research, prototyping, and stakeholder alignment to turn ambiguous briefs into measurable wins — the role at the centre of modern software teams.
- Code · Intermediate · New
Churn-Prediction Model for a B2B Vertical SaaS
Use 18 months of anonymized data (provided) covering: usage events, login frequency, support tickets, NPS responses, billing health, feature adoption, practice firmographics. De…
- Supervised Learning
- Python Programming
- ML Applications
Machine Learning (CS Elective)
- Analysis · Intermediate · New
Evaluate Speech-to-Text Quality for a Contact-Center Analytics Vendor
You receive 200 anonymized call-recording snippets (2-4 minutes each, ~67 per language) with reference transcripts plus a domain glossary of about 600 product terms. Run all thr…
- Speech Recognition
- Sequence Models
- Model Evaluation
Machine Perception
- Analysis · Intermediate · New
Analyze a Learning-Analytics Dataset for At-Risk Detection
You receive an anonymized dataset of LMS engagement features (logins, assignment submissions, forum posts, video-watch time), grade history, and a binary label for end-of-semest…
- Learning Analytics
- Classification
- Fairness Metrics
AI in Education and Learning Analytics
- Code · Intermediate · New
Ship a Churn-Prediction Mini-Project End to End
You receive a 12-month anonymized dataset of subscriber events (logins, lesson completions, payment history, support tickets) for around 200,000 users. Define churn precisely (n…
- Feature Engineering
- Model Evaluation
- Gradient Boosting
AI/ML Practicum and Hands-on Lab
Build a verifiable portfolio.
Submissions become evidence. Reviewers with shipping experience score against a rubric; the result becomes a credential anyone can verify.
- Code · Intermediate · New
Build a Face-Anonymization Tool for a Civic-Tech Newsroom
Use a pretrained face detector (RetinaFace or YOLOv8-face is fine). Build a Python tool with a Gradio or Streamlit UI that: (1) detects faces in an uploaded photo, (2) shows det…
- Object Detection
- Image Processing
- OpenCV
Computer Vision (Undergraduate)
- Analysis · Intermediate · New
Audit a Hiring-Screening Model for Demographic Bias
You receive: (a) inference API access to the production model (black-box), (b) a 12,000-resume audit benchmark with self-declared gender and age-band labels (consented, GDPR-com…
- Fairness Metrics
- Bias Auditing
- Model Evaluation
AI Ethics, Fairness, and Responsible AI
- Analysis · Intermediate · New
Optimize Hyperparameters with Bayesian Optimization on a Tight Budget
You receive a B2B-SaaS churn dataset (about 12,000 customer-month rows, 38 features) and a fixed sweep budget of 40 trials per model family. Implement a Bayesian optimizer (Optu…
- Bayesian Optimization
- Hyperparameter Tuning
- Ensemble Methods
Advanced Machine Learning
- Code · Intermediate · New
Train a Word-Alignment Model for Low-Resource Catalan-Aranese
You receive a 35,000-sentence Catalan-Aranese parallel corpus plus a 1,200-pair manually annotated word-alignment test set. Train (1) a classic statistical alignment baseline (e…
- Alignment
- Neural MT
- Low-Resource MT
Machine Translation
- Code · Intermediate · New
Image-Quality Triage Tool for a Tele-Radiology Network
You receive 10,000 chest-X-ray images with multi-label quality flags (rotation, clipping, motion). Train a small multi-label CNN that outputs a per-flag probability and a single…
- Medical Imaging
- Classification
- Convolutional Neural Networks
Machine Learning for Imaging and Medical Image Analysis
- Analysis · Intermediate · New
Customer Churn Prediction for a 40-Person SaaS Scale-Up
You receive a dataset with 500 customers and 10 features (e.g., monthly logins, number of support tickets, contract length, industry). Your task is to perform exploratory analys…
- Logistic Regression
- Classification
- Feature Engineering
Econometrics
- Code · Intermediate · New
Tune a Recommender for an EU Streaming Music App
Use the public Last.fm-360k or similar dataset (anonymized listening histories) as a stand-in. Implement a baseline matrix-factorization recommender, then a hybrid that adds tra…
- Recommender Systems
- Feature Engineering
- Model Evaluation
Applied Machine Learning
- Analysis · Intermediate · New
Approximate Inference for a Topic Model on Customer Tickets
You receive 180,000 tickets (subject + body) spanning the last 18 months. Preprocess into a bag-of-words representation with sensible stopwords and bigrams. Fit a 20-topic LDA v…
- Variational Inference
- Latent Dirichlet Allocation
- Approximate Inference
Probabilistic Graphical Models
- Code · Intermediate · New
Team Practicum: Build a Crop-Disease Classifier with a Field Partner
You receive a labeled dataset of about 8,000 phone photos plus around 1,200 unlabeled photos from a held-out county. Audit and clean the labels (expect 5-10% noise), train a Mob…
- Transfer Learning
- PyTorch
- Model Evaluation
AI/ML Practicum and Hands-on Lab
- Code · Intermediate · New
Build a Fairness Evaluation Harness for a Credit-Score Model
Implement a Python module that, given model predictions, ground truth, and group identifiers, computes demographic parity difference, equal-opportunity difference, predictive-pa…
- Algorithmic Fairness
- Statistical Evaluation
- Python
AI Measurement and Evaluation
- Code · Intermediate · New
Build an Embedding-Based Semantic Search for a Legal-Document Corpus
Embed the 380k-document corpus using a multilingual sentence-transformer (e.g. multilingual MPNet or LaBSE). Store embeddings in FAISS or pgvector. Build a search service that r…
- Deep Learning
- ML Applications
- Python Programming
Machine Learning (CS Elective)
- Code · Intermediate · New
Markov Random Field for Image Segmentation in Crop Monitoring
You receive 60 Sentinel-2 image tiles (10-meter resolution) over 12 vineyards, each tile with per-pixel disease labels from agronomist field walks. Take the consultancy's existi…
- Markov Random Fields
- Graph Cuts
- Image Segmentation
Probabilistic Graphical Models
How it works
From brief to credential, in six steps.
Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.
Industry teams behind a decade of practitioner briefs
Hiring from this pool?
Sponsor a challenge and meet candidates through actual work.
Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables — not resumes.