Data Sciences Challenges

Explore data science challenges on Ewance to build skills employers expect from analysts and ML engineers. Work through challenges on data cleaning, exploratory analysis, modeling, and storytelling with data.

Design | Beginner | New
Design a Retrieval Pipeline for a Climate-Research Open Archive
You receive a metadata sample (5,000 documents) plus 50 example researcher queries (mixed-language). Design a retrieval pipeline architecture that: (1) extracts and normalizes s…
- Retrieval Architecture
- Hybrid Search
- Multilingual Search
Open coursework
Code | Beginner | New
Build a Product Knowledge Graph for a Fast-Fashion Retailer
You receive 200 sample SKUs across 4 markets (Spain, Germany, Japan, Brazil) as CSVs with country-specific attribute names. Design an OWL ontology with shared classes for Produc…
- Knowledge Graphs
- OWL Ontology
- RDF
Knowledge Graphs and Semantic Web
Research | Beginner | New
Curate a Domain Lexicon for a Climate-Tech NLP Stack
You receive 5,000 policy documents and a benchmark of 200 documents with manually tagged domain terms. Curate a lexicon of ~1,500 terms with (1) canonical English form, (2) Swah…
- Lexical Resources
- Named Entity Recognition
- spaCy
Linguistic Engineering and Language Technologies
Code | Beginner | New
Predict Subscription Churn for an EdTech Platform
You receive a CSV with about 18,000 student-month rows: features include login frequency, session length, quiz scores, parent app opens, and plan tier. The target is whether the…
- Supervised Learning
- Logistic Regression
- Gradient Boosting
Machine Learning (Undergraduate)
Develop in-demand professional skills.
Each challenge names the skills it strengthens. Over time, your profile fills with the competences a hiring manager would actually look for.
Analysis | Beginner | New
Detect Fraudulent Refund Requests for a Mid-Market Marketplace
You receive a labeled dataset with buyer history, seller history, shipping carrier, refund reason text, and outcome label (legit / fraud). Train and evaluate at least two classi…
- Classification
- Model Calibration
- Imbalanced Classification
Machine Learning (Undergraduate)
Analysis | Beginner | New
Stress-Test a Hiring-Funnel Model for Bias
You receive a synthetic-but-realistic dataset of 25,000 past applicants with features (years of experience, education tier, prior role tags) and outcome labels (advanced past th…
- Model Evaluation
- Fairness Metrics
- Logistic Regression
Machine Learning (Undergraduate)
Strategy | Beginner | New
Scope a Demand-Forecasting Model with Operations Stakeholders
You receive recorded interview transcripts (or summary notes) for the three personas, plus a sample of the historical sales data. Map each stakeholder's pain to candidate ML pro…
- Stakeholder Framing
- ML Problem Scoping
- Metric Design
Machine Learning in Practice
Code | Beginner | New
Reduce Dimensionality on Sensor Streams for a Mid-Cap Robotics OEM
You receive 120 robot-hours of windowed sensor data (5s windows, 240 channels) with labels for normal vs. one of four fault classes. Implement (1) PCA, (2) kernel PCA with an RB…
- Dimensionality Reduction
- Kernel Methods
- Autoencoders
Machine Learning
Explore role
Strategy Analyst
Frame the business question, model the options, build the recommendation. From market sizing to competitive analysis, this role is where strategy consulting meets in-house decision-making.
Analysis | Beginner | New
Evaluate Speech-to-Text Quality for a Contact-Center Analytics Vendor
You receive 200 anonymized call-recording snippets (2-4 minutes each, ~67 per language) with reference transcripts plus a domain glossary of about 600 product terms. Run all thr…
- Speech Recognition
- Sequence Models
- Model Evaluation
Machine Perception
Code | Beginner | New
Prototype a Multimodal Visual-Question-Answering Demo
You will use a small open-source vision-language model (e.g., LLaVA-1.5-7B or PaliGemma) and prompt-engineer it for the warehouse-VQA task. Build a Gradio web demo. Construct a …
- Vision-Language Models
- Multimodal Perception
- Prompt Engineering
Machine Perception
Design | Beginner | New
Build an Attention-Visualization Tool for Translation Quality Audit
You will load a small open-source EN-FR transformer (e.g., Helsinki-NLP Opus-MT-en-fr), build a Streamlit or Gradio demo that lets the user paste English source, see the French …
- Attention Mechanisms
- Neural MT
- Tool Design
Machine Translation
Code | Beginner | New
Train a Word-Alignment Model for Low-Resource Catalan-Aranese
You receive a 35,000-sentence Catalan-Aranese parallel corpus plus a 1,200-pair manually annotated word-alignment test set. Train (1) a classic statistical alignment baseline (e…
- Alignment
- Neural MT
- Low-Resource MT
Machine Translation
Get recognized by recruiters and employers.
Credentials are blockchain-anchored via LearnCoin — tamper-evident, portable, link-shareable on LinkedIn and beyond.
Research | Beginner | New
Drug-Repurposing Candidate Screen with Embedding Similarity
You receive (1) a list of 15 known therapeutic candidates (SMILES + ChEMBL identifiers) for a single rare disease, (2) a database of about 4,500 marketed drugs (SMILES + ATC cod…
- Molecular Embeddings
- Similarity Search
- Transfer Learning
Machine Learning for Healthcare and Biomedicine
Code | Beginner | New
Ship a Churn-Prediction Mini-Project End to End
You receive a 12-month anonymized dataset of subscriber events (logins, lesson completions, payment history, support tickets) for around 200,000 users. Define churn precisely (n…
- Feature Engineering
- Model Evaluation
- Gradient Boosting
AI/ML Practicum and Hands-on Lab
Code | Beginner | New
Team Practicum: Build a Crop-Disease Classifier with a Field Partner
You receive a labeled dataset of about 8,000 phone photos plus around 1,200 unlabeled photos from a held-out county. Audit and clean the labels (expect 5-10% noise), train a Mob…
- Transfer Learning
- PyTorch
- Model Evaluation
AI/ML Practicum and Hands-on Lab
Analysis | Beginner | New
Build a Topic-Modeling Pipeline for Citizen Feedback
Take the 60,000 comments (anonymized). Build a BERTopic pipeline with multilingual sentence embeddings (Catalan + Spanish + occasional English). Tune number-of-topics via topic-…
- Topic Modeling
- BERTopic
- Multilingual NLP
Natural Language Processing
Analysis | Beginner | New
Approximate Inference for a Topic Model on Customer Tickets
You receive 180,000 tickets (subject + body) spanning the last 18 months. Preprocess into a bag-of-words representation with sensible stopwords and bigrams. Fit a 20-topic LDA v…
- Variational Inference
- Latent Dirichlet Allocation
- Approximate Inference
Probabilistic Graphical Models
Code | Beginner | New
Markov Random Field for Image Segmentation in Crop Monitoring
You receive 60 Sentinel-2 image tiles (10-meter resolution) over 12 vineyards, each tile with per-pixel disease labels from agronomist field walks. Take the consultancy's existi…
- Markov Random Fields
- Graph Cuts
- Image Segmentation
Probabilistic Graphical Models
Code | Beginner | New
Calibrate a Demand Forecast with Bayesian Confidence Intervals
You receive 24 months of weekly demand for 600 SKUs plus the existing XGBoost point predictions. Fit a Bayesian conformal-prediction layer (or, alternatively, a Gaussian-Process…
- Bayesian Inference
- Uncertainty Quantification
- Conformal Prediction
Probabilistic Machine Learning
Code | Beginner | New
Structured-Output Prompts for Invoice Extraction
You receive 300 real invoice transcripts (already OCR-ed) labeled with 14 target fields, plus the current production prompt and its 12 percent failure log. Design a new prompt u…
- Structured Output
- JSON Schema
- Few-Shot Prompting
Prompt Engineering
Code | Beginner | New
Open-Domain QA over Product Documentation
You receive a snapshot of the documentation (Markdown) and 120 real support questions with the URLs of pages containing the answer. Build an open-domain QA pipeline: chunk the d…
- Open-Domain QA
- Passage Retrieval
- Reading Comprehension
Question Answering and Conversational Systems
Design | Beginner | New
Conversational UI for a Personal-Finance Assistant
You will work from 4 scripted scenarios: 'how much did I spend on coffee last month', 'why did my rent payment fail', 'help me set up an emergency fund', and an out-of-scope 'is…
- Conversational UI
- Dialogue Design
- Trust Design
Question Answering and Conversational Systems
Code | Beginner | New
Tabular Q-Learning for Warehouse Slotting
You receive a Python discrete-event simulator with state encoded as a 12-dimensional categorical vector (around 8,000 reachable states) and 6 possible slotting actions, plus 2 y…
- Tabular RL
- Q-Learning
- Epsilon-Greedy
Reinforcement Learning
Code | Beginner | New
Hybrid Search RAG for an HR-Policy Assistant
You receive 1,800 pages of policy documents (Markdown) and 150 labeled question-answer pairs with the gold source policy IDs. Build a hybrid retrieval pipeline: BM25 + dense emb…
- Hybrid Search
- BM25
- Dense Retrieval
Retrieval-Augmented Generation

How it works

From brief to credential, in six steps.

Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.

Industry teams behind a decade of practitioner briefs

Hiring from this pool?

Sponsor a challenge and meet candidates through actual work.

Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables — not resumes.

