Frequent-Itemset Mining on a Grocery Retailer's Basket History

Free · Verified credential · 4 weeks · Advanced

Overview

What this challenge is about.

Load 18 months of basket-level transaction data (Parquet, around 92 GB) into a Spark cluster. Run FP-growth at support thresholds tuned per category (food vs household vs fresh). Filter for itemsets where the lift versus baseline co-occurrence is above 1.4 and the pair has been stable across at least 9 of 18 months. Propose 10 shelf-adjacency changes with expected basket-size lift, ranked by ease of physical implementation. Deliver an analysis notebook, an 8-page recommendation memo, and a follow-up A/B test plan for 6 stores.
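As a rough illustration of the lift and stability filters described above, here is a minimal pure-Python sketch on toy data. It assumes lift is defined as P(a, b) / (P(a) · P(b)) and that stability means the pair clears the 1.4 lift threshold within a calendar month's baskets; the brief does not spell out either definition. The function names `pair_lift` and `stable_pairs` are hypothetical, and on the full dataset the same logic would run as Spark aggregations, not in-memory loops.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets):
    """Return {(a, b): lift} for every item pair seen in `baskets`.

    Assumed definition: lift = P(a, b) / (P(a) * P(b)),
    which simplifies to count_ab * n / (count_a * count_b).
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate, fix pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    return {
        (a, b): count * n / (item_counts[a] * item_counts[b])
        for (a, b), count in pair_counts.items()
    }


def stable_pairs(baskets_by_month, min_lift=1.4, min_months=9):
    """Pairs whose lift exceeds `min_lift` in at least `min_months` months."""
    months_passing = Counter()
    for baskets in baskets_by_month.values():
        for pair, lift in pair_lift(baskets).items():
            if lift > min_lift:
                months_passing[pair] += 1
    return {pair for pair, k in months_passing.items() if k >= min_months}


# Toy month: pasta and sauce co-occur far more often than chance predicts.
toy_month = [
    ["pasta", "sauce"],
    ["pasta", "sauce"],
    ["milk"],
    ["milk", "bread"],
    ["bread"],
]
toy_lifts = pair_lift(toy_month)
```

Computing lift per month, and not once over the pooled 18 months, is what lets the stability count separate durable associations from one-off promotions or seasonal spikes.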

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Mine 240 million baskets to propose 10 evidence-backed shelf-adjacency changes with expected basket-size lift and a follow-up A/B test plan.
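One way the 6-store A/B test plan might evaluate a shelf-adjacency change is an exact permutation test on per-store mean basket size. The sketch below assumes a 3-treated versus 3-control split and a single summary number per store; neither is specified in the brief, and `permutation_p_value` is a hypothetical helper name.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(basket_size_by_store, treated):
    """One-sided exact permutation test for mean(treated) - mean(control).

    Enumerates every way of choosing len(treated) stores as the treated
    group and counts how many give a difference at least as large as
    the observed one.
    """
    stores = sorted(basket_size_by_store)

    def diff(group):
        control = [s for s in stores if s not in group]
        return (mean(basket_size_by_store[s] for s in group)
                - mean(basket_size_by_store[s] for s in control))

    observed = diff(set(treated))
    diffs = [diff(set(g)) for g in combinations(stores, len(treated))]
    # Small tolerance so float rounding does not exclude exact ties.
    extreme = sum(1 for d in diffs if d >= observed - 1e-12)
    return observed, extreme / len(diffs)


# Hypothetical per-store mean basket sizes (items per basket).
sizes = {"A": 10.0, "B": 11.0, "C": 12.0, "D": 13.0, "E": 14.0, "F": 15.0}
observed, p = permutation_p_value(sizes, treated=["D", "E", "F"])
```

With a 3 versus 3 split there are only 20 possible assignments, so the smallest achievable one-sided p-value is 1/20 = 0.05. A plan built on 6 stores may therefore need repeated measurement periods or a within-store before/after comparison to reach a firm conclusion.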

Earning criteria — what you'll demonstrate

Run FP-growth at scale on real basket data, not toy market-basket examples
Tune support thresholds per category to avoid drowning in trivial itemsets
Distinguish lift from raw co-occurrence and pick stable patterns
Translate itemset patterns into operational store-layout decisions
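To make the per-category threshold criterion concrete, the sketch below applies a different minimum support to each category's baskets. It uses brute-force pair counting as a stand-in for FP-growth, which is only practical on toy data; on a Spark cluster the analogous knob would be the `minSupport` parameter of `pyspark.ml.fpm.FPGrowth`, fitted once per category. The threshold values, the pre-split `baskets_by_category` input, and the function name are all assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical thresholds; the brief does not state actual values.
MIN_SUPPORT = {"food": 0.4, "household": 0.2, "fresh": 0.5}


def frequent_pairs_by_category(baskets_by_category, min_support):
    """Return {category: {(a, b): support}} keeping pairs at or above
    that category's own minimum support.

    Brute-force pair counting stands in for FP-growth here.
    """
    result = {}
    for category, baskets in baskets_by_category.items():
        n = len(baskets)
        counts = Counter()
        for basket in baskets:
            counts.update(combinations(sorted(set(basket)), 2))
        threshold = min_support[category]
        result[category] = {
            pair: c / n for pair, c in counts.items() if c / n >= threshold
        }
    return result


# Toy input: baskets already restricted to one category's items (assumption).
toy = {
    "food": [["pasta", "sauce"], ["pasta", "sauce"], ["pasta", "rice"], ["rice"]],
    "household": [["soap", "sponge"], ["soap"], ["sponge"], ["bleach"]],
}
frequent = frequent_pairs_by_category(toy, MIN_SUPPORT)
```

A single global threshold would either bury slow-moving household pairs or flood the output with trivial high-volume food pairs, which is the motivation for tuning the threshold per category.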

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Data Mining and Information Retrieval

Master · CS SE

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Analysis

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Product Manager

Category-management PMs who can read mining output unlock evidence-based merchandising decisions.

This challenge sharpens

lift-analysis
ab-testing
data-storytelling

One more thing

You can put a credential on your CV by Friday.
