Train a Word-Alignment Model for Low-Resource Catalan-Aranese

Free · Verified credential · 2 weeks · Intermediate

Overview

What this challenge is about.

You receive a 35,000-sentence Catalan-Aranese parallel corpus plus a 1,200-pair manually annotated word-alignment test set. Train two aligners: (1) a classic statistical baseline (eflomal or fast_align) and (2) a neural alignment model (e.g., SimAlign or Awesome-Align with a multilingual encoder). Evaluate both on the test set using Alignment Error Rate (AER), precision, and recall. Then discuss the trade-offs (training time, inference speed, dependency on external models) for a public-sector procurement audience, and deliver a 3-page recommendation memo.
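As a rough illustration of the evaluation step, the sketch below computes AER, precision, and recall from Pharaoh-style alignment lines. It assumes the gold annotation distinguishes sure links (written `i-j`) from possible links (written `ipj`), which is a common convention but is not confirmed for this test set; if the gold file only has one link type, every link is treated as sure and AER reduces to 1 minus F1. The function names `parse_links` and `alignment_scores` are hypothetical helpers, not part of any aligner.

```python
from typing import Dict, Sequence, Set, Tuple

Link = Tuple[int, int]


def parse_links(line: str) -> Tuple[Set[Link], Set[Link]]:
    """Parse one line of Pharaoh-style links into (sure, possible) sets.

    Assumed convention: "i-j" is a sure link, "ipj" is a possible link.
    Sure links are also added to the possible set, so S is a subset of P.
    """
    sure: Set[Link] = set()
    possible: Set[Link] = set()
    for token in line.split():
        if "p" in token:
            i, j = token.split("p")
            possible.add((int(i), int(j)))
        else:
            i, j = token.split("-")
            link = (int(i), int(j))
            sure.add(link)
            possible.add(link)
    return sure, possible


def alignment_scores(gold_lines: Sequence[str],
                     predicted_lines: Sequence[str]) -> Dict[str, float]:
    """Corpus-level precision, recall, and AER.

    precision = |A ∩ P| / |A|
    recall    = |A ∩ S| / |S|
    AER       = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|)
    """
    if len(gold_lines) != len(predicted_lines):
        raise ValueError("gold and predicted files must have the same length")

    a_total = s_total = a_and_s = a_and_p = 0
    for gold, pred in zip(gold_lines, predicted_lines):
        sure, possible = parse_links(gold)
        predicted, _ = parse_links(pred)  # aligner output uses "i-j" only
        a_total += len(predicted)
        s_total += len(sure)
        a_and_s += len(predicted & sure)
        a_and_p += len(predicted & possible)

    precision = a_and_p / a_total if a_total else 0.0
    recall = a_and_s / s_total if s_total else 0.0
    denom = a_total + s_total
    aer = 1.0 - (a_and_s + a_and_p) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "aer": aer}
```

For a gold line `0-0 1-1 2p2` and a predicted line `0-0 1-2 2-2`, this gives precision 2/3, recall 1/2, and AER 0.4. Counts are pooled over the whole test set before dividing, so long sentences weigh more than short ones.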

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Pick the best word-alignment model for the low-resource Catalan-Aranese language pair under a set of public-procurement constraints.

Earning criteria — what you'll demonstrate

Train and evaluate statistical and neural word alignment models
Apply Alignment Error Rate and related metrics correctly
Reason about low-resource language constraints and tooling availability
Communicate trade-offs to a public-sector procurement audience
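For the training step, a minimal sketch of corpus preparation may help. fast_align reads one sentence pair per line in the form `source ||| target`; the helper below (a hypothetical name, `to_fast_align_lines`) produces such lines. It assumes plain whitespace tokenization, which is a simplification, and the example sentences are placeholder tokens rather than corpus data.

```python
from typing import Iterable, Iterator, Tuple


def to_fast_align_lines(pairs: Iterable[Tuple[str, str]]) -> Iterator[str]:
    """Yield 'source ||| target' lines for a statistical aligner.

    Assumptions: tokens are separated by whitespace, and pairs with an
    empty side are skipped because they carry no alignable words.
    """
    for source, target in pairs:
        source_tokens = source.split()
        target_tokens = target.split()
        if not source_tokens or not target_tokens:
            continue
        # Re-join with single spaces so stray whitespace is normalized.
        yield " ".join(source_tokens) + " ||| " + " ".join(target_tokens)


if __name__ == "__main__":
    demo_pairs = [("a  b c", "x y"), ("", "z"), ("d", "w")]
    for line in to_fast_align_lines(demo_pairs):
        print(line)
```

Skipping a pair shifts the line numbering of everything after it, so this filter suits the training corpus only. Test-set pairs should keep their original order and count so the aligner's output lines still match the gold annotation line by line.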

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Machine Translation

Master · AI/ML

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

NLP Engineer

Word-alignment work on low-resource pairs is exactly the niche NLP engineers own at govtech and translation-tooling shops.

This challenge sharpens

alignment
low-resource-mt
transformer

Applied AI Scientist

Picking a model that fits public-sector procurement constraints (open-source, supportability) is core applied-AI work at consultancies serving governments.

This challenge sharpens

model-evaluation
low-resource-mt
alignment

ML Researcher

Comparing statistical and neural alignment with rigorous AER reporting is the kind of focused ML-research deliverable that small NLP labs hire for.

This challenge sharpens

neural-mt
alignment
model-evaluation

One more thing

You can put a credential on your CV by Friday.