Drug-Repurposing Candidate Screen with Embedding Similarity

Free · Verified credential · 2 weeks · Intermediate

Overview

What this challenge is about.

You receive (1) a list of 15 known therapeutic candidates (SMILES + ChEMBL identifiers) for a single rare disease and (2) a database of about 4,500 marketed drugs (SMILES + ATC codes). Compute molecular embeddings using two methods: Morgan fingerprints and ChemBERTa-style learned embeddings. For each method, rank the marketed drugs by similarity to the centroid of the known candidates. Produce a top-50 shortlist and annotate each entry with its primary ATC class and known indications. Discuss where the two methods agree and where they disagree. Deliver a 4-page memo for the medicinal-chemistry team.
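
The centroid-ranking step can be sketched roughly as follows. This is a minimal illustration, not the challenge's reference solution: it assumes the embeddings have already been computed (by either method) and are available as plain lists of floats, it uses cosine similarity as one plausible choice of measure, and the names `rank_by_centroid`, `known_candidates`, and `marketed` plus the toy 3-dimensional vectors are hypothetical.

```python
import math


def centroid(vectors):
    """Element-wise mean of equal-length embedding vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_centroid(known, library, top_k=50):
    """Rank library entries by similarity to the centroid of `known`.

    `known` is a list of vectors; `library` maps a drug name to its vector.
    Returns the top_k (name, score) pairs, highest score first.
    """
    c = centroid(known)
    scored = [(name, cosine(vec, c)) for name, vec in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (assumption): real vectors would be fingerprint bits or
# learned embeddings with hundreds or thousands of dimensions.
known_candidates = [[1.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
marketed = {
    "drug_A": [1.0, 0.5, 1.0],
    "drug_B": [0.0, 1.0, 0.0],
    "drug_C": [1.0, 0.0, 0.0],
}

shortlist = rank_by_centroid(known_candidates, marketed, top_k=2)
for name, score in shortlist:
    print(f"{name}\t{score:.3f}")
```

Tanimoto similarity is the more usual measure for binary fingerprints, but averaging bit vectors yields a real-valued centroid, so a continuous measure such as cosine (or a continuous Tanimoto variant) is needed at this step. Which one to use is a design choice worth stating in the memo.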

Credential: Blockchain-anchored

Shareable: LinkedIn-ready

Language: English

Pace: Self-paced

The Brief

What you'll do, and what you'll demonstrate.

Build a two-method computational drug-repurposing screen and deliver an annotated top-50 shortlist medicinal chemists can discuss.

Earning criteria — what you'll demonstrate

Apply molecular embeddings (chemo-informatic + neural) to a real screening question
Run a centroid-based similarity ranking and reason about its assumptions
Compare classical and neural embedding methods on the same task
Frame computational-screen output respectfully for medicinal chemists
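
For the method-comparison criterion, one simple starting point is to measure how much the two top-k shortlists overlap. The sketch below is illustrative only: it assumes each method has already produced an ordered list of drug names (best first), and `topk_overlap`, `morgan_ranking`, and `learned_ranking` are hypothetical names with made-up entries.

```python
def topk_overlap(ranking_a, ranking_b, k=50):
    """Compare the top-k entries of two rankings (lists of names, best first)."""
    top_a, top_b = set(ranking_a[:k]), set(ranking_b[:k])
    shared = top_a & top_b
    union = top_a | top_b
    return {
        "shared": sorted(shared),          # candidates both methods agree on
        "only_a": sorted(top_a - top_b),   # surfaced only by the first method
        "only_b": sorted(top_b - top_a),   # surfaced only by the second method
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Toy rankings (assumption): real lists would hold about 4,500 names each.
morgan_ranking = ["drug_A", "drug_C", "drug_D", "drug_B"]
learned_ranking = ["drug_C", "drug_E", "drug_A", "drug_B"]

report = topk_overlap(morgan_ranking, learned_ranking, k=3)
print(report)
```

The `only_a` and `only_b` lists are often the most useful part for the memo, since they point at the specific molecules where the two representations disagree and can be examined individually.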

Program Fit

Where this fits in your program.

Sharpens the same skills your degree expects you to demonstrate.

Machine Learning for Healthcare and Biomedicine

Master · AI/ML

Fit score: 1

Skills

Skills you'll demonstrate.

Each one shows up on your verified credential.

Careers

Roles this prepares you for.

Real titles. Real skill bridges. Pick the one closest to your trajectory.

Career paths this builds toward

Canonical roles

Applied AI Scientist
AI Research

Applied AI Scientist

Building computational-screen pipelines whose outputs a chemistry team can read is the daily work of an applied AI scientist at any AI-forward drug-discovery startup.

This challenge sharpens

molecular-embeddings
similarity-search
transformer

ML Researcher

Comparing classical chemoinformatic and learned neural embeddings on the same screening task is the kind of focused ML-research study small biotech labs value.

This challenge sharpens

molecular-embeddings
transfer-learning
transformer

Data Scientist

Pairing a similarity pipeline with a respectful, chemist-readable memo is exactly the cross-functional data-scientist work biotechs hire for.

This challenge sharpens

similarity-search
exploratory-data-analysis
molecular-embeddings

One more thing

You can put a credential on your CV by Friday.