Computer & Information Sciences
Data Science Challenges
Real data-science projects and challenges on Ewance — clean messy datasets, build and evaluate models, and turn raw data into decisions the way a working data scientist does. Solve them to build a portfolio of verified, recruiter-checkable proof you can do the work — not just describe it.
Recommended challenges
- Code, Intermediate, New
Hierarchical Plans for an Aerospace Maintenance Crew Scheduler
You receive a synthetic week of 80 work orders with hierarchical decompositions, technician certifications, and shared-tool constraints. Implement an HTN planner (PyHOP or HDDL …
- HTN Planning
- Domain Modeling
- Constraint Handling
Automated Planning
- Research, Intermediate, New
Run an Alignment Probe on a Coding Assistant
You will design 240 probe prompts across 3 classes: (1) over-refusal (innocuous coding asks the model should fulfill), (2) insecure code patterns (asks where the model should wa…
- Red Team Operations
- Alignment Evaluation
- LLM Evaluation
Large Language Models
- Research, Intermediate, New
Audit a Public LLM Benchmark for Validity Threats
Choose one open LLM benchmark (e.g., MMLU, GPQA, BIG-Bench-Hard, MATH). Read the benchmark paper plus at least three follow-up critiques. Audit (1) data contamination risk again…
- Benchmark Evaluation
- Data Contamination Analysis
- Annotation Methodology
AI Measurement and Evaluation
- Research, Beginner, New
Build an Accessibility Checklist for a Voice Health Assistant
You receive 20 audio samples spanning accents and speech patterns, the assistant's published dialog state machine, and a list of current voice prompts. Audit the assistant for i…
- Accessibility (WCAG 2.2)
- Interaction Design
- Evaluation
Human-Computer Interaction for AI Systems
Practice your coursework on real scenarios.
Every challenge is shaped from real-world context — not generic exercises. The work mirrors what your degree prepares you for.
Why Ewance
- Research, Intermediate, New
Quantify Distribution Shift for a Climate-Risk Model
You receive the model artifact (a gradient boosted regressor predicting expected annual loss per property), 2010-2020 training data, and a 2021-2024 holdout. Quantify covariate …
- Distribution Shift
- Covariate Shift
- Concept Drift
Trustworthy AI, Robustness, and Safety
- Code, Intermediate, New
Predict Loan Default Risk for a Cross-Border Fintech
You receive 18 months of transactions (around 12M rows) and seller-firmographic data. Define a defensible proxy label for default (e.g., a 60-day chargeback-or-dispute spike com…
- Feature Engineering
- Model Selection
- Model Evaluation
Applied Machine Learning
- Strategy, Beginner, New
Plan a Self-Improving Sales-Research Agent
Build the v0 agent: given a company URL, it gathers 5 fact bullets (recent news, headcount range, tech stack hints, hiring patterns, a recent leadership change) and drafts a 4-l…
- AI Agents
- Agent Design
- A/B Testing & Experimentation
AI Agents and LLM-Based Agents
- Code, Intermediate, New
Run a Monte Carlo Tree Search Strategy for a Robotics Pick-and-Place Task
You receive a simulator of the pick-and-place task: a bin with 10 randomly-placed parts, an action space of which part to pick next, and a reward = parts picked per minute with …
- Monte Carlo Tree Search
- Planning
- Simulation
Decision Making Under Uncertainty
- Browse challenges
Explore role
Product Manager
Ship product that solves real user problems. Combine user research, prototyping, and stakeholder alignment to turn ambiguous briefs into measurable wins — the role at the centre of modern software teams.
- Code, Intermediate, New
Build a LangGraph Multi-Agent Researcher
Design the four-agent topology with explicit message contracts. Implement each agent as a separate LLM call with role-specific system prompts, tool access (web search for resear…
- Multi-Agent Orchestration
- LangGraph or CrewAI Workflows
- Tool Use
Multi-Agent Systems
- Strategy, Intermediate, New
Designing a BI Strategy for a Regional Retail Chain
Your challenge is to design a BI architecture including a data warehouse (conceptual and logical models), an ETL strategy to integrate data from multiple sources (POS, inventory…
- Data Warehousing
- ETL Fundamentals
- OLAP
Business Intelligence
- Analysis, Intermediate, New
Benchmark NPUs for an Autonomous Forklift Vision Stack
You receive ONNX exports of the 3 production models, a labeled validation set of 2,000 forklift-camera frames, and developer-kit access to three NPU candidates (anonymized as NP…
- Edge Inference
- NPU Benchmarking
- ONNX Optimization
Edge ML and On-Device Machine Learning
- Analysis, Intermediate, New
Evaluate an Agent Suite on the SWE-Bench-Style Coding Benchmark
You receive a sandboxed set of 50 small repo-modification tasks (test-passing as the success signal). Run 3 open-source agent frameworks (e.g., OpenHands, SWE-agent, and Aider) …
- AI Agents
- Agent Evaluation
- Benchmarking
AI Agents and LLM-Based Agents
Build a verifiable portfolio.
Submissions become evidence. Reviewers with shipping experience score against a rubric; the result becomes a credential anyone can verify.
- Code, Intermediate, New
LLM-Powered FAQ Chatbot for 40-Person SaaS Scale-up
You have access to TaskFlow's internal documentation, help articles, and a sample of 500 support tickets. Your task is to build a retrieval-augmented generation (RAG) pipeline: …
- Large Language Models
- RAG Architectures
- Information Retrieval
Text Analytics and Natural Language Processing
- Strategy, Beginner, New
Design an Internal AI-Use Policy for a Mid-Cap Bank
You receive the bank's existing IT-acceptable-use policy and a description of which AI tools are being rolled out (an internal Anthropic Claude wrapper for general use; a code-c…
- AI Governance Frameworks
- Policy Design
- Responsible AI
AI Ethics, Fairness, and Responsible AI
- Research, Intermediate, New
Run an Adversarial-Robustness Audit on a Face-Liveness Model for a Fintech
You receive a stand-in face-liveness model with the same backbone as the production model plus a labeled evaluation set of 2,000 frames. Apply three standard digital attacks (FG…
- Adversarial Robustness Research
- Face Liveness
- PyTorch or TensorFlow
Deep Learning for Computer Vision
- Code, Beginner, New
Prototype a Multimodal Visual-Question-Answering Demo
You will use a small open-source vision-language model (e.g., LLaVA-1.5-7B or PaliGemma) and prompt-engineer it for the warehouse-VQA task. Build a Gradio web demo. Construct a …
- Vision-Language Models
- Multimodal Perception
- Prompt Patterns
Machine Perception
- Research, Intermediate, New
Train a NeRF for Real-Estate Virtual Tours
You receive a curated dataset of 3 apartments, each with around 120 input images and known camera poses (already SfM-processed). Train a NeRF variant (Instant-NGP or Nerfacto re…
- Neural Scene Representation
- NeRF
- PyTorch or TensorFlow
3D Vision and Multi-View Geometry
- Code, Intermediate, New
Constitutional AI Critique Loop for Hallucination Reduction
You receive the meal-planning prompts (60 test cases with dietary constraints), an unrevised baseline (single-pass instruction-tuned model), and an empty nutrition-constraint co…
- Constitutional AI
- Self-Critique
- Alignment Prompting
Machine Learning from Human Preferences (RLHF and Alignment)
- Research, Beginner, New
Evaluate Speech Synthesis Voices for an EdTech Storyteller App
You will generate 60 audio clips (20 per vendor) covering 4 story genres and 3 emotional tones. Recruit 15 native Spanish speakers via a remote panel (Prolific or local equivale…
- TTS Evaluation
- Listening Studies
- MOS Scoring
Speech Recognition and Spoken Language Processing
- Design, Intermediate, New
Co-Design a Trust Layer for an Enterprise RAG Assistant
You will plan and run a 5-day remote co-design study with eight pilot users (a mix of plant operators and middle managers). Sessions 1-2: discover where trust breaks down. Sessi…
- Co-Design
- User Research
- Trust and Transparency
Human-Computer Interaction for AI Systems
- Strategy, Beginner, New
Pitch a Regulatory Sandbox Application for an Edtech AI Product
Read the EU AI Regulation's regulatory-sandbox provisions. Pick a member-state sandbox program (Spain, Norway-as-EEA, or a German-state pilot are publicly documented options) an…
- Regulatory Analysis
- AI Governance Frameworks
- Product Strategy
AI Law, Policy, and Regulation
- Code, Intermediate, New
Wire a Knowledge Graph into a Pharma RAG Assistant
You receive: 100 internal benchmark questions with reference answers; a 50,000-document anonymized RAG index; a curated drug-target-disease KG (~80,000 triples) loaded into a tr…
- KG-Grounded RAG
- SPARQL
- Entity Linking
Knowledge Graphs and Semantic Web
- Code, Intermediate, New
Safety-Critical Test Harness for an AV Planner
Use CARLA (open-source AV simulator) and encode 10 representative safety scenarios across 3 categories (cut-in, pedestrian emergence, signalized-intersection right-of-way). Writ…
- Simulation
- Scenario Testing
- Safety Evaluation
AI for Autonomous Vehicles
- Research, Beginner, New
Case-Study Analysis of a Public AI Incident
Pick one public AI incident (suggestions: a chatbot's harmful response that went viral, a facial-recognition false-arrest case, a financial-model bias scandal). Produce a 6-page…
- Incident Analysis
- Responsible AI
- Case Study Research
AI Ethics, Fairness, and Responsible AI
How it works
From brief to credential, in six steps.
Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.
Industry teams behind a decade of practitioner briefs
Hiring from this pool?
Sponsor a challenge and meet candidates through actual work.
Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables — not resumes.