Doppel

Find songs that sound like the one you love — not just what other listeners clicked.

Doppel is a hybrid retrieve-then-rerank engine: cultural sources surface candidates, CLAP audio embeddings rerank them by how they actually sound, and an LLM explains the picks — but never ranks them.

19/19benchmark seeds audio-scored across 8 genres·median resolve found-ratio0.987
cultural recallperceptual audio reranktext vibe-steeringgrounded rationale
replay console·11 recorded runs·latest capture 2026-06-12·no live backend
replay

Picking a seed replays the recordedpipeline run for that exact request — stage-by-stage, from real persisted telemetry. Arbitrary live input takes ~12 min cold, so the engine never runs on this site; the deep dive walks the cold→warm story in prose.

Livethe production engine, right nowconnecting…
API
Corpus
Queries served
Last backup
Independent monitor (healthchecks.io):healthchecks.io status

Fetching the latest snapshot the VPS pushed to Cloudflare R2…

The short version

How a “find songs that sound alike” idea became a two-leg retrieve-then-rerank pipeline.

  1. 01

    The problem

    “Play me something like this” usually resolves to collaborative filtering, which drifts toward whatever is already popular — or to taste graphs that never actually listen to the song.

  2. 02

    Two dead ends

    Asking an LLM to read a track’s BPM and key assumes it has heard the song (it hasn’t), and Spotify closed those audio endpoints to new apps in 2024. Pre-embedding a royalty-free corpus satisfies the math but answers a chart hit with thirty tracks nobody has heard of.

  3. 03

    The wedge

    The fix is to let each leg cover the other’s blind spot: cultural sources know what listeners treat as similar, the audio model knows what actually sounds similar. Neither is trustworthy alone; together they are.

  4. 04

    The evidence

    19/19 benchmark seeds audio-scored across 8 genres, and the rerank visibly reshapes the cultural shortlist — the funnel on every result page shows that narrowing on real numbers.

Vibe-steered runs

The same seed, reshaped by a free-text mood. Directional steering, not a hard filter — the text encoder is a deliberately weak leg.