Map a Climate-Policy Corpus to Linked Open Data

Free · Verified credential · 3 weeks · Advanced

Overview

What this challenge is about.

You receive 12,000 policy PDFs and a benchmark of 200 documents with manually linked entities (places, organizations, policies). Build a pipeline that runs named-entity recognition (NER), generates candidates from Wikidata and EuroVoc, and disambiguates them by combining string similarity with knowledge-graph (KG) context similarity. Evaluate entity-level precision and recall on the benchmark. Output the corpus as an RDF dataset that follows Linked Data principles (dereferenceable URIs, owl:sameAs links to Wikidata, attribution metadata), and publish it as a small zipped Turtle file plus a README that the non-profit can host on its site.
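To make the disambiguation step concrete, here is a minimal sketch of one way to combine string similarity with KG-context similarity. It is not a prescribed method for the challenge: the weighted sum, the `alpha` weight, the Jaccard overlap, the candidate dictionary keys, and the candidate identifiers are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def string_similarity(mention, label):
    """Case-insensitive character-level similarity in [0, 1]."""
    return SequenceMatcher(None, mention.lower(), label.lower()).ratio()


def context_similarity(doc_entities, candidate_neighbours):
    """Jaccard overlap between entities already linked in the document
    and the candidate's neighbours in the knowledge graph."""
    if not doc_entities or not candidate_neighbours:
        return 0.0
    shared = doc_entities & candidate_neighbours
    return len(shared) / len(doc_entities | candidate_neighbours)


def rank_candidates(mention, doc_entities, candidates, alpha=0.5):
    """Score each candidate with a weighted sum of both similarities.

    `candidates` is a list of dicts with the hypothetical keys
    'id', 'label' and 'neighbours'; `alpha` is an assumed weight
    that would normally be tuned on the benchmark.
    Returns (score, id) pairs, best first.
    """
    scored = []
    for cand in candidates:
        s_string = string_similarity(mention, cand["label"])
        s_context = context_similarity(doc_entities, cand["neighbours"])
        scored.append((alpha * s_string + (1 - alpha) * s_context, cand["id"]))
    return sorted(scored, reverse=True)


# Placeholder identifiers, not real Wikidata or EuroVoc IDs.
example_candidates = [
    {"id": "cand-paris-tx", "label": "Paris, Texas", "neighbours": {"cand-usa"}},
    {"id": "cand-paris-fr", "label": "Paris", "neighbours": {"cand-france"}},
]
ranking = rank_candidates("Paris", {"cand-france"}, example_candidates)
print(ranking[0][1])
```

A weighted sum keeps the two signals easy to inspect separately, which helps when explaining linking errors on the 200-document benchmark.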

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Link a 12,000-document climate-policy corpus to Wikidata and EuroVoc with measured precision and recall, and publish it as Linked Open Data.
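As an illustration of the publishing step, the sketch below writes one linked entity as Turtle using only the standard library. The `https://example.org/` base URIs, the function names, and the choice of `rdfs:label` and `dcterms:source` for labelling and attribution are assumptions; in practice an RDF library would be a more robust way to serialize the dataset.

```python
PREFIXES = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
)

# Wikidata's concept URIs use this namespace followed by the item ID.
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for a Turtle string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def entity_to_turtle(base_uri, slug, label, qid, source_doc_uri):
    """Return one Turtle description: label, owl:sameAs link, source document.

    `base_uri` and `slug` are hypothetical; they stand for whatever
    dereferenceable URI scheme the hosting site provides.
    """
    return (
        f'<{base_uri}{slug}> rdfs:label "{escape_literal(label)}"@en ;\n'
        f"    owl:sameAs <{WIKIDATA_ENTITY}{qid}> ;\n"
        f"    dcterms:source <{source_doc_uri}> .\n"
    )


turtle_doc = PREFIXES + "\n" + entity_to_turtle(
    "https://example.org/entity/",   # assumed base URI
    "paris",
    "Paris",
    "Q90",                           # Wikidata item for Paris
    "https://example.org/doc/0001",  # assumed document URI
)
print(turtle_doc)
```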

Earning criteria — what you'll demonstrate

Build an end-to-end entity-linking pipeline against Wikidata and EuroVoc
Apply Linked Data publishing principles (URIs, sameAs, attribution)
Evaluate entity-linking quality with entity-level precision and recall
Author methodology notes appropriate to grant-reporting expectations
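The evaluation criterion above can be sketched as follows. This assumes a link counts as correct only when the document, mention span, and target URI all match the benchmark exactly; the tuple layout and the sample data are hypothetical, and a looser span-matching rule would give different numbers.

```python
def precision_recall(gold, predicted):
    """Entity-level precision and recall.

    `gold` and `predicted` are sets of (doc_id, start, end, entity_uri)
    tuples; a prediction is a true positive only on an exact match.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical benchmark links and pipeline output for one document.
gold_links = {
    ("doc-1", 0, 5, "http://www.wikidata.org/entity/Q90"),
    ("doc-1", 20, 26, "http://www.wikidata.org/entity/Q142"),
    ("doc-1", 40, 54, "http://www.wikidata.org/entity/Q458"),
    ("doc-1", 60, 66, "http://www.wikidata.org/entity/Q183"),
}
predicted_links = {
    ("doc-1", 0, 5, "http://www.wikidata.org/entity/Q90"),   # correct
    ("doc-1", 20, 26, "http://www.wikidata.org/entity/Q30"),  # wrong target
}
precision, recall = precision_recall(gold_links, predicted_links)
print(precision, recall)
```

Reporting both numbers matters for the methodology notes: a pipeline that links only when it is very sure can show high precision while missing most benchmark entities.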

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Knowledge Graphs and Semantic Web

Master · AI/ML

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Data Engineer

Publishing a corpus as Linked Open Data with measured linking quality is the day-to-day of data engineers at research and open-data orgs.

This challenge sharpens

linked-open-data
rdf
entity-linking

NLP Engineer

NER plus disambiguation against a real KG is core NLP-engineer work in any entity-linking product.

This challenge sharpens

ner
entity-linking
wikidata

AI Solutions Architect

Specifying the linked-data architecture and the publishing pipeline is the AI solutions architect's role in open-data engagements.

This challenge sharpens

linked-open-data
rdf
sparql

One more thing

You can put a credential on your CV by Friday.
