Evaluation

If you like applying Evaluation skills, each challenge here lets you practice them on a real industry brief.

Recommended Challenges

Design · Advanced · New
Design Hybrid Search for an E-Commerce Product Catalog
You receive 80,000 anonymized product records (title, description, category, attributes) and a sample of 30,000 search log entries with click-through labels. Embed the catalog w…
- Hybrid Search
- Embedding Models
- BM25
Vector Databases and Embeddings
Code · Advanced · New
Multi-View Pose Estimation for a Sports-Analytics Startup
Use the publicly released SoccerNet dataset or a synthetic 4-view dataset (you can render one with Unity or use a provided one). Implement a 2D pose estimator per view (HRNet or YOLOv8-pose…
- Pose Estimation
- Multi-View Geometry
- 3D Reconstruction
Computer Vision
Code · Advanced · New
Train a GNN for Fraud-Ring Detection at a Payments Fintech
You receive an anonymized transaction dataset (around 120,000 merchants, around 4 million transactions over 12 months, around 2% labeled fraud) and the team's LightGBM baseline.…
- Graph Neural Networks
- GraphSAGE
- Fraud Detection
Machine Learning on Graphs
Code · Advanced · New
Extract Structured Lease Terms for a Commercial Real-Estate Platform
You receive 500 anonymized lease PDFs and a labelled gold set of 150 leases with the 14 fields filled in. Build a pipeline that does (1) layout-aware PDF parsing (Unstructured, …
- Information Extraction
- PDF Parsing
- Named Entity Recognition
Linguistic Engineering and Language Technologies
Practice your coursework on real scenarios.
Every challenge is shaped from real industry context — not generic exercises. The work mirrors what your degree prepares you for.
Code · Advanced · New
Extract Skills and Roles from Job Postings for a Recruiter Tool
You receive 30,000 anonymized job postings and a labelled 1,000-posting benchmark with (skill, role, seniority) spans. Fine-tune a small token classifier (e.g., DeBERTa-v3-base)…
- Information Extraction
- Token Classification
- ESCO Taxonomy
Linguistic Engineering and Language Technologies
Analysis · Advanced · New
Imitation Learning from Human Demos for a Drone Inspection
You receive 6 hours of expert pilot demonstrations (state-action pairs at 20 Hz) recorded in an AirSim wind-farm environment with 3 turbine designs, plus a held-out 4th turbine …
- Imitation Learning
- Behavioral Cloning
- DAgger
Deep Reinforcement Learning
Code · Advanced · New
Natural Language Inference for an HR-AI Compliance Tool
Use SNLI/MNLI/ANLI as starting data and curate 200 domain-specific HR examples (synthetic or anonymized) for fine-tuning. Fine-tune a small encoder (DeBERTa-v3-base or similar),…
- Natural Language Inference
- Transformer Models
- Fine-Tuning
Computational Semantics
Code · Advanced · New
Segment Cells from Microscopy Images for a Pharma-AI Discovery Lab
You receive 3,500 microscopy images with pixel-level cell masks plus a 200-image hold-out set re-annotated by two biologists for inter-annotator agreement. Train a U-Net or SegF…
- Semantic Segmentation
- U-Net
- PyTorch
Deep Learning for Computer Vision
Code · Advanced · New
Lambda-Calculus Semantic Parser for a Math-Tutor EdTech
Define a small typed lambda-calculus representation for linear equations and a small set of word-problem templates (rate, age, mixture). Build a parser that maps surface express…
- Semantic Parsing
- Lambda Calculus
- Symbolic Reasoning
Computational Semantics
Code · Advanced · New
DPO Fine-Tune for a Domain-Specific Writing Assistant
You receive a base instruction-tuned model checkpoint plus 2,500 preference pairs from editorial reviews (each pair: two grant-application paragraphs, the editor-preferred winne…
- DPO
- Preference Learning
- Model Finetuning
Machine Learning from Human Preferences (RLHF and Alignment)
Code · Advanced · New
Fine-Tune a Transformer for Customer-Support Triage at an Enterprise AI Vendor
You receive 240,000 labeled support tickets across 14 queues, with English, Bahasa Indonesia, and Tagalog. Fine-tune a multilingual transformer encoder (XLM-RoBERTa-base is a st…
- Transformers
- Fine-Tuning
- Multilingual NLP
Deep Learning
Code · Advanced · New
Build a Multimodal Generation Pipeline for a Tourism Operator
You receive 40 sample 30-second videos shot by tour guides, the operator's brand voice doc, and SEO keyword lists for EN/PT/ES. Build a pipeline that (1) extracts a representati…
- Multimodal Generation
- Vision-Language Models
- LLM Inference
Generative AI
Build a verifiable portfolio.
Submissions become evidence. Reviewers with shipping experience score against a rubric; the result becomes a credential anyone can verify.
Research · Advanced · New
Visual Question Answering for a Pediatric Radiology Workflow
You receive ~8,000 publicly available pediatric chest X-rays with structured findings labels (anonymized; no PHI access required). Build a VQA pipeline that maps an (image, quest…
- Vision-Language Models
- Visual Question Answering
- LoRA Finetuning
Visual Intelligence and Visual Reasoning
Research · Advanced · New
Train a NeRF for Real-Estate Virtual Tours
You receive a curated dataset of 3 apartments, each with around 120 input images and known camera poses (already SfM-processed). Train a NeRF variant (Instant-NGP or Nerfacto re…
- Neural Scene Representation
- NeRF
- PyTorch
3D Vision and Multi-View Geometry
Code · Advanced · New
De-Identify Patient Images for a Pharma Research Pipeline
You receive 500 internal benchmark images (already cleared for use), each labelled with bounding boxes around face/tattoo/jewelry regions. Build a pipeline that detects these re…
- Image De-Identification
- Object Detection
- Privacy-Preserving Vision
Image Processing and Computational Imaging
Analysis · Advanced · New
Detect Defects on a Production Line for a Tier-1 Auto Supplier
You receive 12,000 labelled grayscale part images (8,000 good, 4,000 defective across 6 defect types) at 2048x2048. Build a pipeline that does classical preprocessing (illuminat…
- Defect Detection
- Image Classification
- Image Preprocessing
Image Processing and Computational Imaging
Code · Advanced · New
Semantic Parser for an Enterprise Analytics Assistant
Define a small typed query language (filter, aggregate, group_by, time_range, metric). Curate or write 200 training examples covering the controlled subset and 50 held-out test …
- Semantic Parsing
- Grammar Design
- Transformer Models
Computational Semantics
Code · Advanced · New
Scene-Graph Generation for Retail Shelf Audits
You receive 1,500 labeled shelf photos (anonymized product crops, bounding boxes, and ~12 relation types). Build a pipeline that, for a new shelf photo, outputs (a) detected pro…
- Scene Graph Generation
- Object Detection
- Relation Prediction
Visual Intelligence and Visual Reasoning
Code · Advanced · New
Train a Sequence Model for Wearable-Telemetry Sleep Staging at a Healthtech
You receive 220 nights of wearable telemetry from 60 subjects with PSG ground-truth labels. Train three sequence models: an LSTM baseline, a 1D-CNN+GRU hybrid, and a small trans…
- Sequence Models
- LSTM
- Transformers
Deep Learning
Code · Advanced · New
Build a LangGraph Multi-Agent Researcher
Design the four-agent topology with explicit message contracts. Implement each agent as a separate LLM call with role-specific system prompts, tool access (web search for resear…
- Multi-Agent Orchestration
- LangGraph
- LLM Tool Use
Multi-Agent Systems
Research · Advanced · New
Benchmark Graph-Embedding Methods on a Climate-Network Dataset
You receive a 200M-edge sample of the knowledge graph and a labeled entity-similarity test set (5,000 pairs with relevance labels). Benchmark three methods: a shallow embedding …
- Graph Embeddings
- Graph Neural Networks
- Scalable ML
Machine Learning at Scale
Code · Advanced · New
Extractive QA on Clinical Trial Protocols
You receive 500 anonymized protocol PDFs (already OCR-ed to text) and 1,200 labeled question-answer pairs where each answer is an exact text span. Build an extractive QA system:…
- Extractive QA
- Reading Comprehension
- Model Finetuning
Question Answering and Conversational Systems
Code · Advanced · New
Build an Audio-Visual Speaker Diarization Pipeline
Build the pipeline: face detection + active-speaker detection on video, voice-activity detection + speaker embeddings on audio, then a fusion step that ties tracks to detected f…
- Audio-Visual Fusion
- Speaker Diarization
- Active Speaker Detection
Multimodal Machine Learning
Code · Advanced · New
Build a Multilingual Customer-Email Classifier
You receive 28,000 labeled emails (skewed toward English and Mandarin). Try at least two approaches: (1) a fine-tuned multilingual transformer (XLM-RoBERTa or mDeBERTa) and (2) …
- Text Classification
- Multilingual NLP
- Transformers
Natural Language Processing

How it works

From brief to credential, in six steps.

Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.

Industry teams behind a decade of practitioner briefs

Hiring from this pool?

Sponsor a challenge and meet candidates through actual work.

Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables — not resumes.
