AI & Data
NLP Challenges
NLP challenges put you inside the work of teaching machines to read and make sense of language. You'll develop skills in natural language processing fundamentals, text tokenization, and word embeddings, and in tasks like named entity recognition and sequence labeling using NLTK.
From there you'll handle the harder edges: encoder fine-tuning (BERT family) with Hugging Face Transformers, custom tokenization, relation extraction, information retrieval, and multilingual NLP, plus knowledge representation as real NLP teams practice it. Each challenge you solve earns a verified credential you can share with recruiters.
- Code · Intermediate · New
Adapt Machine Translation to a Niche Domain
Pick an open MT base (NLLB-200 or a strong open M2M model). Build a parallel corpus of around 8,000 sentence pairs from the company's bilingual safety standards. Fine-tune on th…
- Machine Translation
- Domain Adaptation
- Hugging Face Transformers
Natural Language Processing
- Research · Beginner · New
Drug-Repurposing Candidate Screen with Embedding Similarity
You receive (1) a list of 15 known therapeutic candidates (SMILES + ChEMBL identifiers) for a single rare disease, (2) a database of about 4,500 marketed drugs (SMILES + ATC cod…
- Molecular Embeddings
- Similarity Search
- Transfer Learning
Machine Learning for Healthcare and Biomedicine
- Code · Intermediate · New
Fine-Tune a Small Transformer for Legal-Domain EN-DE Translation
You receive a 120,000-segment parallel EN-DE legal corpus and a held-out 1,000-segment test set with reference translations. Fine-tune a small pretrained Transformer (e.g., NLLB…
- Neural MT
- Hugging Face Transformers
- Fine-Tuning
Machine Translation
- Code · Intermediate · New
LLM-Powered FAQ Chatbot for 40-Person SaaS Scale-up
You have access to TaskFlow's internal documentation, help articles, and a sample of 500 support tickets. Your task is to build a retrieval-augmented generation (RAG) pipeline: …
- Large Language Models
- RAG Architectures
- Information Retrieval
Text Analytics and Natural Language Processing
Develop in-demand professional skills.
Each challenge names the skills it strengthens. Over time, your profile fills with the competences a hiring manager would actually look for.
Why Ewance
- Code · Intermediate · New
Instruction-Tune a Small Model for an Edtech Tutor
You receive a 1.5B base model (e.g., SmolLM-1.7B or Qwen-1.8B), permission to use 2 hours of a rented A100, and a curated seed of around 5,000 math-tutoring dialogues. Augment w…
- Instruction Tuning
- Fine-Tuning
- Dataset Curation
Fine-Tuning Large Language Models
- Code · Intermediate · New
Build a Vector-Search Backend for an Enterprise AI Knowledge Assistant
You receive a corpus of around 20,000 PDFs (mixed scanned and digital) totalling around 30 GB and a labeled retrieval set of 200 queries with human-judged ground-truth passages.…
- RAG Architectures
- Vector Database Basics
- Word Embeddings
Data Engineering and Big Data Systems
- Code · Intermediate · New
Domain-Adapt an NLP Pipeline from News to Customer-Support Tickets
You receive 30,000 anonymized customer-support tickets (PT-BR + ES) plus the news-trained NER and intent models. Apply continued pretraining of a multilingual encoder (e.g., XLM…
- Transfer Learning
- Domain Adaptation
- Continued Pretraining
Meta-Learning, Transfer Learning, and Multi-Task Learning
- Code · Intermediate · New
Description-Logic Reasoner for Insurance-Policy Coverage Checks
You receive 50 representative coverage rules in plain English (from the current rule engine) and a sample of 1,000 anonymized claim cases with the current engine's outcomes (cov…
- Description Logics
- OWL
- Reasoning
Fuzzy Logic, Knowledge Representation, and Symbolic Reasoning
Browse challenges
Explore role
Strategy Analyst
Frame the business question, model the options, build the recommendation. From market sizing to competitive analysis, this role is where strategy consulting meets in-house decision-making.
- Design · Intermediate · New
Visualize Embedding Drift for a RAG Knowledge Assistant
You receive weekly snapshots over 12 weeks of around 50,000 document embeddings each (1024-dim). Design and build a visualization tool that: (a) projects each snapshot to 2D wit…
- Word Embeddings
- Dimensionality Reduction
- UMAP
Data Visualization
- Code · Beginner · New
Build an Embedding-Based Semantic Search for a Legal-Document Corpus
Embed the 380k-document corpus using a multilingual sentence-transformer (e.g. multilingual MPNet or LaBSE). Store embeddings in FAISS or pgvector. Build a search service that r…
- Deep Learning
- ML Applications
- Python or JavaScript
Machine Learning (CS Elective)
- Code · Intermediate · New
Train a Sequence Model for Wearable-Telemetry Sleep Staging at a Healthtech
You receive 220 nights of wearable telemetry from 60 subjects with PSG ground-truth labels. Train three sequence models: an LSTM baseline, a 1D-CNN+GRU hybrid, and a small trans…
- Sequence Models
- LSTM
- Hugging Face Transformers
Deep Learning
- Research · Senior · New
Compare RNN vs Transformer for Long-Sequence Modeling
Pick a public trajectory dataset (e.g., Argoverse 2, Waymo Open, or ETH-UCY). Implement three models with comparable parameter counts (around 5M each): an LSTM baseline, a vanil…
- Hugging Face Transformers
- RNN
- State Space Models
Neural Networks for NLP
Get recognized by recruiters and employers.
Credentials are blockchain-anchored via LearnCoin: tamper-evident, portable, and shareable by link on LinkedIn and beyond.
- Code · Intermediate · New
Extract Structured Lease Terms for a Commercial Real-Estate Platform
You receive 500 anonymized lease PDFs and a labelled gold set of 150 leases with the 14 fields filled in. Build a pipeline that does (1) layout-aware PDF parsing (Unstructured, …
- Information Extraction
- PDF Parsing
- Named Entity Recognition
Linguistic Engineering and Language Technologies
- Research · Intermediate · New
QLoRA Fine-Tune for a Customer-Support Domain Assistant
You receive 8,000 anonymized support ticket pairs (question -> agent response), the company's product documentation (around 600 pages), and a strong RAG baseline already running…
- QLoRA
- Fine-Tuning
- RAG Architectures
Fine-Tuning Large Language Models
- Code · Intermediate · New
Distributional Embeddings for a Multilingual Legal Search
Use a public multilingual corpus (e.g., MultiEURLEX or a subset of EUR-Lex) plus a small hand-built test set of around 100 cross-lingual query-passage pairs. Fine-tune (or evalu…
- Distributional Semantics
- Multilingual NLP
- Sentence Embeddings
Computational Semantics
- Design · Beginner · New
Build an Attention-Visualization Tool for Translation Quality Audit
You will load a small open-source EN-FR transformer (e.g., Helsinki-NLP Opus-MT-en-fr), build a Streamlit or Gradio demo that lets the user paste English source, see the French …
- Attention Mechanisms
- Neural MT
- Tool Design
Machine Translation
- Code · Beginner · New
Build a Multilingual Text-Mining Dashboard for Hotel Reviews
You receive 200,000 sampled reviews across 9 languages plus an English-only labeled benchmark of 1,000 reviews for sentiment and aspect (rooms, food, staff, value, location). Bu…
- Multilingual NLP
- Sentiment Analysis
- Aspect Extraction
Linguistic Engineering and Language Technologies
- Research · Intermediate · New
Probe a Pretrained Encoder for Linguistic Knowledge
Take BERT-base (or DeBERTa-v3-base). Run layer-wise probes across at least 3 linguistic tasks: part-of-speech tagging, dependency arc classification, and semantic role labeling.…
- Interpretability
- Probing
- Hugging Face Transformers
Neural Networks for NLP
- Code · Intermediate · New
Fine-Tune a Sequence-to-Sequence Model for Code-Doc Generation
Take a small base model (CodeT5+ or a distilled CodeLlama-Instruct). Build the dataset by mining around 8,000 high-quality function-docstring pairs from permissively-licensed Py…
- Seq2Seq
- Hugging Face Transformers
- Fine-Tuning
Neural Networks for NLP
- Research · Beginner · New
Survey Information-Retrieval Research for an AdTech Platform's Roadmap
Build a reading list of 30-40 papers spanning SIGIR, RecSys, KDD, WSDM, and arXiv from 2023-2025 across (a) dense retrieval architectures, (b) learning-to-rank with click feedba…
- Information Retrieval
- Learning to Rank
- Research Synthesis
Data Mining and Information Retrieval
- Code · Beginner · New
Train a Word-Alignment Model for Low-Resource Catalan-Aranese
You receive a 35,000-sentence Catalan-Aranese parallel corpus plus a 1,200-pair manually annotated word-alignment test set. Train (1) a classic statistical alignment baseline (e…
- Alignment
- Neural MT
- Low-Resource MT
Machine Translation
- Code · Intermediate · New
Build a Domain-Specific Named-Entity Recognizer for Legal Contracts
Start from a strong English NER base (spaCy transformer or LegalBERT). Fine-tune on a provided 1,200-contract labeled dataset for the 9 entity types. Handle long contracts (ofte…
- Named Entity Recognition
- Sequence Labeling
- Domain Adaptation
Natural Language Processing
How it works
From brief to credential, in six steps.
Step 01
Browse challenges aligned to your studies.
Step 02
Accept the one that fits your goals.
Step 03
Work through it with AI Copilot guidance.
Step 04
Submit for structured evaluation.
Step 05
Earn a verified credential.
Step 06
Add it to LinkedIn with one click.
Industry teams behind a decade of practitioner briefs
Hiring from this pool?
Sponsor a challenge and meet candidates through actual work.
Industry teams can shape briefs around the skills they hire for, then evaluate students on rubric-scored deliverables rather than resumes.