Fine-Tune a 3B Open-Weight Model for Customer Support Triage
Overview
What this challenge is about.
You receive 40,000 anonymized, labelled support tickets across 18 categories. Fine-tune a 3B open-weight model for ticket classification using LoRA, a parameter-efficient fine-tuning method. Evaluate per-category F1 on a 5,000-ticket held-out set against the vendor API baseline. Measure inference latency at batch sizes 1 and 8 on a single A100 or L4 GPU. Deliver a training notebook, the trained adapter, a benchmark report, and a 3-page deployment recommendation covering the fallback path, monitoring, and a cost-per-1k-tickets comparison.
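To make the evaluation step concrete, here is a minimal sketch of per-category F1 in plain Python. The function names and the example category labels are illustrative assumptions, not part of the challenge materials; in practice a library such as scikit-learn would likely do this for you.

```python
from collections import Counter


def per_category_f1(y_true, y_pred):
    """Return {category: F1} for every category seen in labels or predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted category gains a false positive
            fn[truth] += 1  # true category gains a false negative
    scores = {}
    for cat in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[cat] + fp[cat] + fn[cat]
        scores[cat] = 2 * tp[cat] / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-category F1 scores."""
    return sum(scores.values()) / len(scores)


# Hypothetical labels, for illustration only.
y_true = ["billing", "billing", "login", "shipping"]
y_pred = ["billing", "login", "login", "shipping"]
scores = per_category_f1(y_true, y_pred)
```

Running the same function on the fine-tuned model's predictions and on the vendor API's predictions for the held-out set gives a like-for-like comparison, category by category.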
The Brief
What you'll do, and what you'll demonstrate.
Replace a vendor classification API with a fine-tuned open-weight 3B model that beats it on quality, cost, or both, with a fallback plan in place.
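One way to frame the cost side of that comparison is cost per 1,000 tickets. The sketch below assumes the self-hosted cost is driven by a GPU hourly price and measured throughput, and that the vendor charges per 1,000 tokens; both pricing models and every number shown are hypothetical placeholders, so substitute your own measurements and the vendor's actual pricing.

```python
def self_hosted_cost_per_1k(gpu_hourly_usd, tickets_per_second):
    """Cost of classifying 1,000 tickets on a GPU that is kept fully busy."""
    tickets_per_hour = tickets_per_second * 3600
    return gpu_hourly_usd / tickets_per_hour * 1000


def vendor_cost_per_1k(usd_per_1k_tokens, avg_tokens_per_ticket):
    """Cost of classifying 1,000 tickets under per-token vendor pricing."""
    cost_per_ticket = usd_per_1k_tokens * avg_tokens_per_ticket / 1000
    return cost_per_ticket * 1000


# Hypothetical inputs, for illustration only.
local_cost = self_hosted_cost_per_1k(gpu_hourly_usd=3.6, tickets_per_second=10)
vendor_cost = vendor_cost_per_1k(usd_per_1k_tokens=0.002, avg_tokens_per_ticket=500)
```

The self-hosted figure assumes full utilization. A GPU that sits idle part of the time still bills by the hour, so the effective cost per ticket rises as utilization falls.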
Earning criteria — what you'll demonstrate
- Apply LoRA fine-tuning to a 3B open-weight model on a real classification task
- Benchmark a fine-tuned model against a vendor API on quality, latency, and cost
- Design a deployment with a fallback path and basic monitoring
- Reason about data-residency benefits of in-house LLMs
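The fallback path and basic monitoring named above could be sketched as a confidence-threshold router. This is one possible design, not a prescribed one: `local_model`, `vendor_api`, the 0.8 threshold, and the `TriageResult` type are all hypothetical names and values chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class TriageResult:
    category: str
    source: str  # "local" or "vendor"


# Simple monitoring: count how often each path handles a ticket.
route_counts = Counter()


def triage(ticket, local_model, vendor_api, min_confidence=0.8):
    """Use the local model; fall back to the vendor on errors or low confidence."""
    try:
        category, confidence = local_model(ticket)
    except Exception:
        confidence = None  # treat any local failure as a reason to fall back
    if confidence is None or confidence < min_confidence:
        route_counts["vendor"] += 1
        return TriageResult(vendor_api(ticket), "vendor")
    route_counts["local"] += 1
    return TriageResult(category, "local")


# Stub callables standing in for the fine-tuned model and the vendor API.
def confident_model(ticket):
    return ("billing", 0.95)


def unsure_model(ticket):
    return ("billing", 0.40)


def stub_vendor(ticket):
    return "login"


first = triage("I was charged twice", confident_model, stub_vendor)
second = triage("cannot sign in??", unsure_model, stub_vendor)
```

Tracking the share of tickets routed to the vendor gives a basic health signal: a rising fallback rate can indicate drift in ticket content or a problem with the local service. Note that any ticket sent down the vendor path leaves your infrastructure, which matters for the data-residency argument.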
Program Fit
Where this fits in your program.
Sharpens the same skills your degree expects you to demonstrate.
Skills
Skills you'll demonstrate.
Each one shows up on your verified credential.
Careers
Roles this prepares you for.
Real titles. Real skill bridges. Pick the one closest to your trajectory.
Machine Learning Engineer
Owning a LoRA fine-tune from data to deployment recommendation is core MLE work at any AI-forward company moving off vendor APIs.
This challenge sharpens
- lora-fine-tuning
- classification
- deployment-design
AI Engineer
Wiring an open-weight model into a production-shaped service with monitoring and fallback is the AI-engineer skillset that scaling teams hire for.
This challenge sharpens
- open-weight-llms
- inference-benchmarking
- deployment-design
MLOps Engineer
The cost/latency benchmark plus the fallback design bridges directly into MLOps work on serving platforms.
This challenge sharpens
- inference-benchmarking
- deployment-design
- llm-evaluation