Extract Skills and Roles from Job Postings for a Recruiter Tool
Overview
What this challenge is about.
You receive 30,000 anonymized job postings and a labeled 1,000-posting benchmark annotated with (role, skill, seniority) spans. Fine-tune a small token classifier (e.g., DeBERTa-v3-base) for span extraction. Build a normalization layer that maps extracted skill strings to ESCO skill URIs by combining fuzzy matching with embedding similarity. Report per-class F1 for extraction and end-to-end (extraction + normalization) accuracy on the held-out benchmark. Deliver: pipeline code, the fine-tuned model, an ESCO mapping notebook, and a 3-page deployment recommendation that includes a refresh strategy for when ESCO updates.
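As a rough illustration of the normalization layer, the sketch below blends a fuzzy string score with a vector similarity score and returns the best-matching URI, or nothing when the match is too weak. Everything in it is an assumption made for illustration: the three-entry taxonomy and its `urn:example:` URIs are placeholders (not real ESCO entries), character-trigram cosine stands in for a real sentence-embedding model, and the `weight` and `threshold` values are arbitrary starting points you would tune on the benchmark.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt

# Hypothetical mini-taxonomy: labels and URIs are placeholders, not real ESCO data.
TAXONOMY = {
    "python programming": "urn:example:skill:python",
    "project management": "urn:example:skill:project-management",
    "machine learning": "urn:example:skill:machine-learning",
}


def trigram_vector(text):
    """Bag of character trigrams; a stand-in for a learned text embedding."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def normalize_skill(surface, taxonomy=TAXONOMY, weight=0.5, threshold=0.5):
    """Return (uri, score) for the best label, or (None, score) below the threshold."""
    query_vec = trigram_vector(surface)
    best_uri, best_score = None, 0.0
    for label, uri in taxonomy.items():
        fuzzy = SequenceMatcher(None, surface.lower(), label).ratio()
        vector = cosine(query_vec, trigram_vector(label))
        score = weight * fuzzy + (1 - weight) * vector
        if score > best_score:
            best_uri, best_score = uri, score
    if best_score < threshold:
        return None, best_score  # abstain rather than force a weak mapping
    return best_uri, best_score
```

Returning `None` below the threshold keeps unmappable strings visible, so they can be reviewed or counted as errors in the end-to-end score instead of being silently mapped to the nearest label.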
The Brief
What you'll do, and what you'll demonstrate.
Turn raw job-posting text into ESCO-normalized (role, skill, seniority) tuples with measurable end-to-end accuracy.
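A token classifier emits one tag per token, so producing tuples requires a decoding step that groups tagged tokens back into spans. The sketch below assumes a BIO tagging scheme and the label names `ROLE`, `SKILL`, and `SENIORITY`; the brief specifies neither, so treat both as hypothetical choices. It also joins tokens with spaces, which ignores subword pieces that a real tokenizer would produce.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (label, text) spans.

    A stray I- tag with no matching open span is treated as the start
    of a new span, which is a lenient choice rather than a standard.
    """
    spans = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # Close any open span, then start a new one.
            if current_tokens:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = label, [token]
        elif prefix == "I":
            current_tokens.append(token)
        else:
            # "O" tag: close any open span.
            if current_tokens:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_tokens:
        spans.append((current_label, " ".join(current_tokens)))
    return spans
```

For example, the tokens `Senior data engineer with Apache Spark` tagged `B-SENIORITY B-ROLE I-ROLE O B-SKILL I-SKILL` decode to one seniority span, one role span, and one skill span.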
Earning criteria — what you'll demonstrate
- Fine-tune a token classifier on a real IE task
- Normalize extracted entities to a public taxonomy (ESCO)
- Evaluate end-to-end IE + normalization accuracy
- Design a refresh strategy for taxonomy-dependent pipelines
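The two evaluation numbers named above can be computed in a few lines. This is a minimal sketch under stated assumptions: spans are scored by exact match on `(doc_id, start, end, label)`, and end-to-end accuracy is taken to be the share of gold spans whose predicted URI equals the gold URI, with a missed or mis-bounded span counting as wrong. The brief does not fix either definition, so state whichever one you use in your report.

```python
def per_class_f1(gold, predicted):
    """Exact-match span F1 per label.

    gold, predicted: sets of (doc_id, start, end, label) tuples.
    """
    labels = {span[3] for span in gold | predicted}
    scores = {}
    for label in sorted(labels):
        g = {s for s in gold if s[3] == label}
        p = {s for s in predicted if s[3] == label}
        tp = len(g & p)
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def end_to_end_accuracy(gold_uris, predicted_uris):
    """Share of gold spans whose predicted URI matches the gold URI.

    gold_uris, predicted_uris: dicts keyed by (doc_id, start, end).
    A gold span with no prediction under the same key counts as wrong.
    """
    if not gold_uris:
        return 0.0
    correct = sum(
        1 for key, uri in gold_uris.items() if predicted_uris.get(key) == uri
    )
    return correct / len(gold_uris)
```

Scoring end-to-end accuracy against gold spans means extraction errors and normalization errors both lower the number, which is the point of reporting it alongside per-class F1.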
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
NLP Engineer
Fine-tuning a token classifier and building a normalization layer on top of it is day-to-day work for NLP engineers at HR-tech and edtech vendors.
This challenge sharpens
- information-extraction
- token-classification
- entity-normalization
Machine Learning Engineer
End-to-end accuracy reporting plus a refresh strategy is core MLE work for taxonomy-dependent pipelines.
This challenge sharpens
- fine-tuning
- evaluation
- entity-normalization
Data Engineer
Owning the ESCO mapping refresh strategy and the pipeline that consumes it is the data-engineering backbone of any career-path product.
This challenge sharpens
- esco-taxonomy
- entity-normalization
- evaluation