There's a moment every jazz musician knows. You're sitting with your instrument, you've got a tune in your head - maybe Stella by Starlight, maybe Blue Bossa, maybe something you've been avoiding because the changes are hard - and you want to practice. Really practice. Not play along to a recording where the tempo is fixed and the key is someone else's choice. Not loop a two-bar phrase in a DAW. You want a rhythm section. A bass player who walks. A pianist who comps. A drummer who swings. You want the feel of playing with people, except it's Tuesday afternoon and everyone you know is at work.
I've been in that moment hundreds of times. For years I used iReal Pro, Band-in-a-Box, YouTube backing tracks, Aebersold records. They all helped. They all had limits. iReal Pro's sounds are functional but synthetic. Band-in-a-Box is powerful but feels like flying a 747 to go to the corner store. YouTube tracks lock you into someone else's arrangement. And Jamey Aebersold - bless that man for what he gave jazz education - recorded those tracks in specific keys at specific tempos, and if your range or your comfort level needs something different, you're on your own.
What I wanted was simple, and it didn't exist: load any tune, see the harmony every way I could imagine, generate a rhythm section that sounds musical, and practice. In my key. At my tempo. In my feel. Open source, so anyone could look under the hood. Free, because the best practice tools should be like the best teachers - available to anyone who shows up ready to work. JMove is the tool I built to fill that gap. What follows is how it works under the hood.
The Three Engines
JMove is built around three engines working together. The first is the generator - a backing track engine that creates walking bass, piano comping, and drum patterns from chord changes. It runs entirely in the browser, requires no server, and is published on npm as @jmove/generator with zero runtime dependencies. The second is the audio engine, which takes a recording and extracts the harmony from it using deep learning models for chord detection and pitch extraction. The third is the AI engine, which handles chord intelligence, reharmonization, and scale analysis. Together the three engines form a complete cycle: hear music, understand it, reimagine it, practice it.
Each engine is written in the language that suits it best. The generator is TypeScript - it needs to run in real time in the browser, manipulating MIDI events with microsecond precision. The audio and AI engines are Python - they need spectral analysis libraries, music theory tools, deep learning frameworks, and server-side compute that the browser can't handle. Two languages, one codebase, connected by a FastAPI backend that orchestrates the heavy lifting while the browser handles everything it can handle alone.
JMove Architecture:

Browser (TypeScript)                 Server (Python)
├── @jmove/generator (npm)           ├── AI Engine
│   ├── generateWalkingBass()        │   ├── chord intelligence
│   ├── generatePianoComping()       │   ├── reharmonization (14 techniques)
│   ├── generateDrumPattern()        │   ├── voicing generation
│   └── 26 STYLE_PRESETS             │   └── chord continuation
├── Next.js 16 + React 19            ├── Audio Engine
│   ├── 8 Zustand stores             │   ├── chord detection
│   ├── VexFlow (score/tab)          │   ├── pitch extraction
│   ├── Tone.js (audio)              │   └── beat & tempo analysis
│   └── TensorFlow.js (synthesis)    └── FastAPI + PostgreSQL + Redis
└── 1,511 standards (748KB JSON)
Zero runtime deps ──────────┘        ML/theory deps ──────────┘

The Generator: Zero Dependencies by Design
The generator is the heart of the practice experience, and its most important architectural constraint is that it has zero runtime dependencies. Not "few" dependencies. Zero. The package.json lists TypeScript, ESLint, Vitest, and tsup as dev dependencies, but at runtime, @jmove/generator imports nothing. It takes chord events in, gives MIDI events out. It doesn't know about Tone.js, React, web audio, or any rendering layer. Pure musical logic, portable to any JavaScript environment.
This constraint matters because a practice tool that depends on a specific audio library or framework is a practice tool that breaks when that library changes its API. The generator has survived three major refactors of the web app's audio pipeline without a single change to its own code. Tone.js handles the sound. The generator handles the music. The boundary between them is a list of objects with pitch, time, duration, and velocity. Nothing else crosses that line.
// The generator's entire public surface
export {
  // Core generation
  generateJamSession,
  generateWalkingBass,
  generatePianoComping,
  generateDrumPattern,

  // Style system
  STYLE_PRESETS,         // 26 presets
  STYLE_CATEGORIES,      // grouped by tradition/modern/latin/groove/experimental
  STYLE_LABELS,          // human-readable names
  autoDetectPreset,      // iReal style -> JMove preset

  // Groove & humanization
  GM_DRUMS,              // General MIDI drum map (18 instruments)
  getGrooveTemplate,     // per-style timing/feel
  applyGroove,           // apply template to events
  tempoSwingMultiplier,  // tempo-dependent swing ratio
  instrumentSwingFactor, // per-instrument swing offset
  dynamicMultiplier,     // volume arc over phrase

  // Types (23 exported)
  type PracticeStyle, type JamKey, type JamForm,
  type JamConfig, type JamResult, type ChordEvent,
  type BassNote, type CompNote, type DrumHit,
  type QuantizedScore, type StylePreset,
  // ...
};

45 Chord Qualities
The generator understands 45 distinct chord qualities - every harmonic color that appears in the real-world jazz repertoire. The major family has eleven members, from a bare major triad to the full Lydian stack (maj13#11). The minor family has seven, including the melodic minor tonic chord m(maj7) that Allan Holdsworth explored with such depth he redefined what the guitar could do with it. The dominant family is the largest with fifteen members - from a plain seventh to the double-altered 7#9b13 that Herbie Hancock would stack over a pedal bass in the Miles Davis Quintet.
Each quality maps to a chord tone array - the semitone intervals from the root that define the chord's identity. The entry for m7 is [0, 3, 7, 10]. The entry for 7alt is [0, 4, 7, 10]. The entry for dim7 is [0, 3, 6, 9] - four notes equidistant by minor thirds, creating the symmetrical structure that resolves in four different directions. These arrays drive everything: the walking bass chooses chord tones for beat two from this list, the piano comping builds voicings from these intervals, the drum pattern generator uses the chord quality to select comping density. The chord is the seed. Everything grows from it.
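To make the interval arrays concrete, here is a minimal sketch of how such a table could turn a root and a quality into absolute MIDI pitches. The table name `CHORD_TONES` and the helper `chordPitches` are hypothetical, not the generator's actual identifiers, and only the three entries quoted above are included. The major-triad fallback for unknown qualities is a simplification for this example.

```typescript
// Hypothetical table: only the three entries quoted in the text.
const CHORD_TONES: Record<string, number[]> = {
  m7: [0, 3, 7, 10],
  "7alt": [0, 4, 7, 10],
  dim7: [0, 3, 6, 9],
};

// Assumed fallback for unknown qualities: a plain major triad.
const MAJOR_TRIAD: number[] = [0, 4, 7];

// Turn a root MIDI note number plus a quality into absolute pitches.
function chordPitches(rootMidi: number, quality: string): number[] {
  const intervals = CHORD_TONES[quality] ?? MAJOR_TRIAD;
  return intervals.map((semitones) => rootMidi + semitones);
}

// D below middle C is MIDI note 50 (middle C = 60).
const dm7 = chordPitches(50, "m7"); // [50, 53, 57, 60]
console.log(dm7);
```

A walking bass line or a piano voicing can then pick from the returned pitches without knowing anything about chord symbols.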
When the generator encounters a quality it doesn't recognize - a typo, an unusual extension, a notation convention it hasn't seen before - it doesn't crash. A heuristic parser walks the quality string looking for keywords: if it contains "dim," treat it as diminished. If it contains "sus," drop the major third. If it starts with "m" and doesn't start with "maj," it's minor. If it contains "7" or "9" or "13" without "maj," it's dominant. If nothing matches, fall back to a major triad. The system always produces music, even from imperfect input. A practice tool that throws an error when you misspell a chord symbol is a practice tool that interrupts your practice.
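The fallback order described above can be sketched as a short chain of string checks. This is an illustration of the rules as stated, not the generator's actual parser: the function name `guessFamily` and the `Family` labels are hypothetical, and the case-sensitive matching is an assumption (so that an uppercase "M" is not mistaken for minor).

```typescript
type Family = "diminished" | "suspended" | "minor" | "dominant" | "major";

// Apply the keyword rules in the order the text lists them.
// The first matching rule wins, so "dim" and "sus" take priority.
function guessFamily(quality: string): Family {
  const q = quality.trim();
  if (q.includes("dim")) return "diminished";
  if (q.includes("sus")) return "suspended"; // the major third is dropped
  if (q.startsWith("m") && !q.startsWith("maj")) return "minor";
  if (/7|9|13/.test(q) && !q.includes("maj")) return "dominant";
  return "major"; // last resort: plain major triad
}

console.log(guessFamily("m7#5weird")); // "minor"
console.log(guessFamily("7sus4"));     // "suspended"
console.log(guessFamily("13b9x"));     // "dominant"
console.log(guessFamily("???"));       // "major"
```

Because every branch returns a usable family, the caller never has to handle an error path.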
26 Style Presets
Each style preset is a complete personality - not just a tempo label but a set of parameters that shapes every aspect of the generated music. The swingAmount ranges from 0 (straight eighths for bossa and latin) to 90 (hard shuffle for blues). The density controls how busy the comping is - ballad at 30, hard bop at 75, funk at 85. The strumMs parameter sets the piano's chord-spread timing - bossa at 15-25 milliseconds for that gentle guitar-like strum, stride at 0 for percussive block chords, ballad at 30 for a soft rolled quality.
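As an illustration of what a strum parameter can do, the sketch below spreads the notes of one chord from lowest to highest across a total window of strumMs. Whether JMove treats strumMs as a total spread or a per-note delay is not stated here, so the even division is an assumption, and `StrummedNote` and `applyStrum` are hypothetical names. Times are in seconds.

```typescript
// Hypothetical note shape, using the four fields that cross the
// generator boundary: pitch, time, duration, velocity.
interface StrummedNote {
  pitch: number;    // MIDI note number
  time: number;     // onset in seconds
  duration: number; // length in seconds
  velocity: number; // 0-127
}

// Spread the chord low-to-high so the last note lands strumMs after the first.
function applyStrum(chord: StrummedNote[], strumMs: number): StrummedNote[] {
  if (chord.length < 2 || strumMs <= 0) {
    return chord.map((note) => ({ ...note })); // block chord: no offsets
  }
  const stepSec = strumMs / 1000 / (chord.length - 1);
  return [...chord]
    .sort((a, b) => a.pitch - b.pitch)
    .map((note, i) => ({ ...note, time: note.time + i * stepSec }));
}

const block: StrummedNote[] = [60, 53, 64, 57].map((pitch) => ({
  pitch, time: 0, duration: 1, velocity: 80,
}));
// With strumMs = 30 the onsets come out at roughly 0, 10, 20 and 30 ms.
console.log(applyStrum(block, 30).map((n) => n.time));
```

With strumMs at 0 every note keeps its original onset, which matches the percussive block-chord feel described for stride.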
The presets are organized into five categories. Traditional holds swing, hard bop, West Coast cool, and ballad - the core vocabulary. Modern holds fusion, ECM, modal, and contemporary jazz. Latin holds bossa nova and Latin fire. Groove holds funk, jazz waltz, and blues shuffle. And Experimental holds the artist-specific presets: neo-soul, Holdsworth fusion, Alfa Mist, Pat Metheny, math rock, and IDM. Three hybrid presets combine per-instrument overrides - Fusion/ECM pairs a fusion bass with ECM piano and drums, creating cross-genre textures that don't exist in any single tradition.
// Style preset structure (26 total)
const STYLE_PRESETS: StylePreset[] = [
  {
    id: "swing",
    label: "Classic Swing",
    category: "traditional",
    params: {
      swingAmount: 70,   // triplet-based swing feel
      density: 55,       // moderate comping density
      strumMs: 8,        // tight piano strum
      tempoRange: [120, 180],
    },
  },
  {
    id: "ecm",
    label: "ECM Space",
    category: "modern",
    params: {
      swingAmount: 10,   // near-straight, floating
      density: 25,       // sparse, lots of space
      strumMs: 20,       // wide, arpeggiated
      tempoRange: [50, 90],
    },
  },
  {
    id: "alfaMist",
    label: "Alfa Mist",
    category: "experimental",
    params: {
      swingAmount: 20,   // subtle swing, hip-hop influenced
      density: 60,       // moderate with rhythmic variety
      strumMs: 12,       // organic piano strum
      tempoRange: [80, 130],
    },
  },
  // ... 23 more presets
];

Eight Zustand Stores
The web app manages state through eight Zustand stores, each responsible for a discrete domain. The syncStore handles playback coordination - current time, selected note, seek position - shared across all five notation views through a single useActiveNotes() hook. The suggestionStore manages the chord suggestion pipeline: the list of SuggestionItems returned from the Python engine, the selected continuation path, the voicing preview. The reharmonizeStore tracks which reharmonization technique is active and which chords have been substituted.
The separation is deliberate. When the piano roll needs to know the current playback time, it subscribes to syncStore. When the suggestion panel needs to display alternative chords, it subscribes to suggestionStore. Neither store knows about the other. The audio engine writes to syncStore on every animation frame; the notation views read from it. The API writes to suggestionStore when analysis completes; the UI reads from it. State flows in one direction, through narrow channels, and the stores never import each other. This isn't a philosophical position about architecture. It's a practical response to what happens when a music app's state management becomes tangled: notes stop syncing, playback stutters, the UI feels unresponsive. Clean boundaries prevent musical artifacts.
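The one-writer, many-readers pattern can be shown without any library. The sketch below uses a tiny hand-rolled store as a stand-in for Zustand so the example stays self-contained; `createStore` here is this example's own helper, not Zustand's API, and the state fields are hypothetical.

```typescript
type Listener<T> = (state: T) => void;

// Minimal stand-in store: get, set (shallow merge), subscribe.
function createStore<T extends object>(initial: T) {
  let state = initial;
  const listeners = new Set<Listener<T>>();
  return {
    getState: (): T => state,
    setState: (partial: Partial<T>): void => {
      state = { ...state, ...partial };
      listeners.forEach((listener) => listener(state));
    },
    subscribe: (listener: Listener<T>): (() => void) => {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

// Two independent stores; neither references the other.
const syncStore = createStore({ currentTime: 0, seekPosition: 0 });
const suggestionStore = createStore({ items: [] as string[] });

// A notation view reads playback time and nothing else.
const seenTimes: number[] = [];
const unsubscribe = syncStore.subscribe((s) => {
  seenTimes.push(s.currentTime);
});

syncStore.setState({ currentTime: 1.5 });       // audio side writes
suggestionStore.setState({ items: ["Dm7"] });   // API side writes
unsubscribe();

console.log(seenTimes); // [1.5]
```

The write to suggestionStore never reaches the playback subscriber, which is the property the separation is meant to guarantee.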
The Python Side
The TypeScript generator handles the real-time work - it runs in the browser at 60fps, converting chord symbols into MIDI events fast enough to keep up with a drummer at 200 BPM. But the heavy analytical work lives in Python. The AI engine wraps music21 - the MIT music theory library that Michael Cuthbert has been developing since 2006 - and uses it for scale family analysis, Roman numeral detection, and the voice-leading calculations that power the reharmonization engine's scoring system.
The audio engine is the intelligence layer for audio analysis. It takes a recording and extracts what a musician would extract by ear: the key, the tempo, the beat positions, the chord progression. Deep learning models handle chord detection from spectrograms and pitch-to-MIDI conversion. The results feed through a refinement step that applies jazz harmonic conventions to clean up the raw neural network output. A raw model might output "C E G B" and call it Cmaj7. The refinement step asks: in this context, with these surrounding chords, is it really Cmaj7, or is it more likely a C7 that the model missed because the flat seventh was masked by the vocal?
The FastAPI backend ties it all together with eleven endpoint groups - analysis, enrichment, reharmonization, suggestion, transcription, voicing generation, stem separation, track management, job status, roadmap voting, and health checks. Long-running jobs (transcription can take minutes on complex audio) are dispatched to an ARQ async queue backed by Redis. Results are cached. Files go to S3. Session data goes to PostgreSQL through Alembic-managed migrations. The API is the bridge between the browser's real-time world and the server's analytical depth.
1,511 Standards, Offline
One of the first things I built was the standards database. 1,511 jazz standards with full chord changes, parsed from the iReal Pro ecosystem through a five-stage pipeline that decodes obfuscated URIs, unscrambles fifty-character segments, tokenizes a cell-based chart language, maps 43 chord quality patterns, and assembles the result into JMove's internal QuantizedScore format. The entire database ships as a single 748-kilobyte JSON file that loads into memory when the page opens. No API call. No authentication. No internet required.
There's something almost sacred about jazz standards. They're the shared language of the music - the common ground where a musician in Tokyo and a musician in New Orleans can sit down together and play without a single word of rehearsal. Autumn Leaves. Satin Doll. Body and Soul. These tunes have been played millions of times by millions of musicians, and every single performance is different because the harmony is just the starting point. The melody is the invitation. What you do with the changes is who you are. Making those changes freely accessible - searchable, transposable, instantly playable with a backing track - felt like the most important thing the app could do.
Why Open Source
The generator is published under the MIT license. Anyone can install it, read the code, modify it, build on it. This was a deliberate decision. Jazz itself is open source. Every standard is a shared protocol. Every lick you learn from a Charlie Parker solo becomes part of your vocabulary, which becomes part of someone else's vocabulary when they hear you play it. The tradition is built on listening, absorbing, transforming, and passing it forward. A practice tool that serves that tradition should work the same way.
If a developer in São Paulo wants to understand how the walking bass algorithm approaches beat four, they can read the code - 1,831 lines of TypeScript in walkingBass.ts, with comments that explain not just what the code does but why. If a music teacher in Berlin wants to modify the bossa nova drum pattern for their students, they can fork the repo and change it. If a researcher wants to study the voice-leading optimization algorithm, the Viterbi DP implementation is right there in voiceLeadingOptimizer.ts with the transition cost function documented.
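For readers who have not met Viterbi-style dynamic programming in a musical setting, here is a minimal sketch of the idea: each chord has several candidate voicings, and the algorithm picks one per chord so that the total movement across the whole progression is smallest. The cost function (summed semitone distance between same-index voices) and the names `transitionCost` and `smoothestPath` are assumptions for illustration, not the contents of voiceLeadingOptimizer.ts. It assumes every chord has at least one candidate.

```typescript
type Voicing = number[]; // MIDI pitches, low to high

// Assumed cost: total semitone movement between same-index voices.
function transitionCost(from: Voicing, to: Voicing): number {
  const voices = Math.min(from.length, to.length);
  let cost = 0;
  for (let i = 0; i < voices; i++) cost += Math.abs(from[i] - to[i]);
  return cost;
}

// candidates[t] holds the possible voicings for chord t.
function smoothestPath(candidates: Voicing[][]): Voicing[] {
  if (candidates.length === 0) return [];
  // best[t][j]: cheapest total cost of any path ending in candidate j at step t
  const best: number[][] = [candidates[0].map(() => 0)];
  // back[t][j]: which candidate at step t-1 that cheapest path came from
  const back: number[][] = [candidates[0].map(() => -1)];

  for (let t = 1; t < candidates.length; t++) {
    best.push([]);
    back.push([]);
    candidates[t].forEach((voicing, j) => {
      let minCost = Infinity;
      let minPrev = -1;
      candidates[t - 1].forEach((prev, i) => {
        const cost = best[t - 1][i] + transitionCost(prev, voicing);
        if (cost < minCost) {
          minCost = cost;
          minPrev = i;
        }
      });
      best[t][j] = minCost;
      back[t][j] = minPrev;
    });
  }

  // Trace back from the cheapest final state.
  const last = candidates.length - 1;
  let j = best[last].indexOf(Math.min(...best[last]));
  const path: Voicing[] = [];
  for (let t = last; t >= 0; t--) {
    path.unshift(candidates[t][j]);
    j = back[t][j];
  }
  return path;
}

// Three chords, two candidate voicings each.
const path = smoothestPath([
  [[53, 57, 60, 64], [60, 64, 65, 69]],
  [[53, 57, 59, 64], [59, 64, 65, 69]],
  [[52, 55, 59, 64], [59, 62, 64, 67]],
]);
console.log(path); // [[53,57,60,64],[53,57,59,64],[52,55,59,64]] (total cost 4)
```

Unlike a greedy choice made chord by chord, the table keeps the best path into every candidate, so an early voicing that looks slightly worse can still win if it leads to smoother motion later.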
The Monorepo
The codebase is a polyglot monorepo orchestrated by a Makefile rather than Turborepo or Nx. The generator lives at the root level. The Python packages live under packages/ - engine and transcriber, each with their own setuptools configuration. The web app and API live under apps/. Docker Compose ties the services together for local development: the Next.js frontend, the FastAPI backend, PostgreSQL, Redis. A single make run starts everything.
The choice of Make over a JavaScript-native build orchestrator was practical: the repo isn't a JavaScript monorepo. It's a polyglot workspace where TypeScript, Python, Docker, and SQL migrations need to coexist. Make doesn't care what language your build target uses. It just runs commands and tracks dependencies. It's the oldest build tool in Unix and it works. Sometimes the best architecture decision is the boring one.
Built in the Shed
Jazz musicians don't say "I'm going to practice." They say "I'm going to the shed." Woodshedding. The term comes from the idea of a musician retreating to a literal shed - away from the house, away from distraction - to work on their craft in private. Charlie Parker reportedly spent years in the woodshed before he emerged with the bebop vocabulary that changed music forever. Sonny Rollins famously practiced on the Williamsburg Bridge for two years. The shed is where the real work happens - unglamorous, repetitive, essential.
JMove was built in a shed, for the shed. The app exists to make that time more productive, more musical, and more available. You shouldn't need a band to practice with a band. You shouldn't need expensive software to see your harmony five different ways. You shouldn't need anything more than curiosity and an instrument. Three engines, two languages, 45 chord qualities, 26 style presets, 1,511 standards, eight Zustand stores, zero runtime dependencies in the generator. All of it in service of one thing: a musician alone in a room, getting better.
If you play jazz - or want to - come shed with us. The rhythm section is always ready.
→ Dive deep into the generator's algorithms
→ Explore all 14 reharmonization techniques