Change app_file to main.py which patches stem rendering before building UI
README.md CHANGED
@@ -5,7 +5,7 @@ colorFrom: green
 colorTo: purple
 sdk: gradio
 sdk_version: 6.13.0
-app_file:
 pinned: true
 license: mit
 suggested_hardware: zero-a10g
@@ -16,108 +16,10 @@ tags:
 - audio
 - mixing
 - stem-separation
--
 short_description: AI analyzes songs, plans transitions, renders DJ sets
 ---

 # AI DJ Set Builder

-**
-
-## How It Works
-
-### 1️⃣ Upload & Analyze
-Upload audio files. Each track is analyzed for:
-- **BPM**: spectral flux onset detection + autocorrelation
-- **Musical key**: Krumhansl-Schmuckler on CQT chroma → Camelot wheel mapping
-- **Energy profile**: RMS curve, loudness dB, spectral centroid (brightness)
-- **Structural segments**: checkerboard kernel novelty on MFCC+chroma self-similarity (Foote 2000)
-- **Cue points**: segment boundaries + 16-bar phrase energy transitions, snapped to downbeats
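
The key-detection step named in the removed list (Krumhansl-Schmuckler followed by a Camelot mapping) can be sketched in plain Python. This is only an illustration, not the Space's code: it assumes a 12-bin chroma vector (index 0 = C) has already been averaged from the CQT chroma upstream, and the names `estimate_key` and `camelot` are hypothetical. The profile values are the published Krumhansl-Kessler probe-tone ratings.

```python
# Krumhansl-Kessler key profiles; index 0 is the tonic.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def estimate_key(chroma):
    """Return (correlation, tonic pitch class, mode) for a 12-bin chroma list.

    Assumption: chroma[0] is the energy of pitch class C.
    """
    best = None
    for tonic in range(12):
        # Rotate so that the candidate tonic lands on index 0.
        rotated = chroma[tonic:] + chroma[:tonic]
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            score = pearson(rotated, profile)
            if best is None or score > best[0]:
                best = (score, tonic, mode)
    return best


def camelot(tonic, mode):
    """Map a tonic pitch class (0 = C) and mode to a Camelot code such as '8B'."""
    # A minor key shares its number with its relative major, 3 semitones up.
    major_tonic = tonic if mode == "major" else (tonic + 3) % 12
    # Each step around the circle of fifths (7 semitones) adds one; C major is 8B.
    number = (7 * major_tonic + 7) % 12 + 1
    return f"{number}{'B' if mode == 'major' else 'A'}"
```

The correlation returned by `estimate_key` is the kind of value a confidence threshold could be applied to, though how the Space derives its confidence figure is not shown in this diff.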
-
-### 2️⃣ Compatibility & Set Planning
-Multi-dimensional compatibility scoring for every track pair:
-- **BPM proximity** (35%): 94.5% of real DJ transitions are within ±5% ([arxiv:2008.10267](https://arxiv.org/abs/2008.10267))
-- **Harmonic compatibility** (30%): Camelot wheel distance
-- **Energy flow** (20%): smooth energy transitions
-- **Timbral similarity** (15%): spectral centroid proximity
-
-Tracks ordered via greedy energy-arc algorithm (warm up → peak → cool down) with configurable no-repeat constraint.
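
The weighted pair score described in the removed section can be illustrated as follows. Only the four weights (35/30/20/15) come from the README text. The `Track` fields, the shape of each sub-score (linear falloff to zero at a 5% tempo difference, three Camelot steps to reach zero, and so on) and all function names are assumptions made for this sketch.

```python
from dataclasses import dataclass

# Weights as listed in the README text.
WEIGHTS = {"bpm": 0.35, "harmonic": 0.30, "energy": 0.20, "timbre": 0.15}


@dataclass
class Track:
    # Hypothetical per-track analysis results.
    bpm: float
    camelot_number: int   # 1..12
    camelot_letter: str   # "A" (minor) or "B" (major)
    energy: float         # assumed normalized to 0..1
    centroid_hz: float    # mean spectral centroid


def bpm_score(a, b):
    # Assumed shape: 1.0 at equal tempo, falling linearly to 0.0 at a 5% gap.
    rel = abs(a.bpm - b.bpm) / a.bpm
    return max(0.0, 1.0 - rel / 0.05)


def harmonic_score(a, b):
    # Assumed distance: steps around the wheel, plus one for an A/B switch.
    steps = abs(a.camelot_number - b.camelot_number)
    steps = min(steps, 12 - steps)
    if a.camelot_letter != b.camelot_letter:
        steps += 1
    return max(0.0, 1.0 - steps / 3)


def energy_score(a, b):
    return max(0.0, 1.0 - abs(a.energy - b.energy))


def timbre_score(a, b):
    hi = max(a.centroid_hz, b.centroid_hz)
    return min(a.centroid_hz, b.centroid_hz) / hi if hi > 0 else 1.0


def compatibility(a, b):
    """Weighted sum of the four sub-scores, in the range 0..1."""
    parts = {
        "bpm": bpm_score(a, b),
        "harmonic": harmonic_score(a, b),
        "energy": energy_score(a, b),
        "timbre": timbre_score(a, b),
    }
    return sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)
```

A greedy set planner could then pick, at each step, the unused track with the highest `compatibility` to the current one that also fits the target energy arc. That ordering logic is not sketched here.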
-
-### 3️⃣ 15 Professional Transition Techniques
-
-Every transition is structured as a sequence of *phases* with multi-band EQ automation, volume curves, and effects sends, mirroring how a real DJ moves faders and knobs over the course of a mix.
-
-#### Foundational
-| Type | Description | Stems? |
-|------|-------------|--------|
-| `eq_crossfade` | 3-band EQ crossfade: lows swap first (S-curve), then mids, then highs. The bread-and-butter of professional DJing. | No |
-| `long_blend` | Extended 64-beat blend with slow S-curve automation on all three EQ bands plus mid-scoop to prevent muddiness. | No |
-| `bass_swap` | Surgical bass exchange on a downbeat using isolated stems: incoming kick/bass replaces outgoing while mids+highs crossfade. | Yes |
-| `slam` | Hard cut precisely on a downbeat with a 10ms anti-click micro-fade. Maximum impact for peak-time moments. | No |
-
-#### Filter-based
-| Type | Description | Stems? |
-|------|-------------|--------|
-| `resonant_filter_sweep` | Resonant LP closes on outgoing (with rising Q) while resonant HP opens on incoming, creating the classic 'whoosh'. | No |
-| `hpf_buildup` | HPF rises exponentially on outgoing (removing kick), incoming drops in full. Mimics the tension of a live breakdown. | No |
-
-#### Effects-based
-| Type | Description | Stems? |
-|------|-------------|--------|
-| `dub_echo_out` | Pioneer DJM-style dub echo: tempo-synced delay with HP filter on the feedback path, creating thin spacey echo trails. | No |
-| `reverb_wash` | Reverb send cranks up on outgoing creating a wash, tail is progressively HP-filtered while incoming S-curves in. | No |
-| `beat_repeat_stutter` | Last beat sliced at accelerating subdivisions (1/4→1/8→1/16→1/32) creating a stutter/riser effect. | No |
-| `spinback` | Simulated vinyl spinback: rapid pitch drop to silence while incoming fades in. Short, dramatic, playful. | No |
-
-#### Stem-based (require demucs)
-| Type | Description | Stems? |
-|------|-------------|--------|
-| `acapella_over_instrumental` | Outgoing vocals isolated and layered over incoming instrumental. Creates a bridge between both tracks. | Yes |
-| `drums_first` | Incoming drums/percussion enter first, then bass swaps, then mids+highs: a layered reveal. | Yes |
-| `double_drop` | Both tracks at full energy with complementary EQ carving: one gets bass, the other gets highs. Combined super-drop. | Yes |
-
-#### Creative / tension-and-release
-| Type | Description | Stems? |
-|------|-------------|--------|
-| `breakdown_swap` | Outgoing breaks down (LP closes), quiet zone, incoming builds and drops. Maximum narrative tension. | No |
-| `noise_riser_cut` | Synthesized white noise riser (rising HP filter) builds tension over 16 beats → hard cut to incoming. Festival-style. | No |
-
-### Context-Aware Selection
-The transition selector considers:
-- **Set position**: opening (gentle entries) → peak (high-energy techniques) → closing (wind-down)
-- **Energy direction**: rising (buildups, hpf) vs falling (echo out, reverb wash)
-- **Structural features**: if track A ends with a breakdown, use `breakdown_swap`
-- **Harmonic/BPM compatibility**: determines what's physically possible
-- **Variety**: never repeats the same technique consecutively
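
Two of the selection rules listed above (prefer `breakdown_swap` after a breakdown, never repeat a technique back to back) are simple enough to sketch. The function name `pick_transition` and the idea of a pre-ranked candidate list are assumptions; how the Space actually ranks techniques by set position and energy direction is not shown in this diff.

```python
def pick_transition(ranked, previous=None, ends_with_breakdown=False):
    """Pick the best-ranked technique that differs from the previous one.

    ranked: technique names ordered from most to least suitable
            (the ranking itself is assumed to be computed elsewhere).
    previous: technique used for the preceding transition, if any.
    ends_with_breakdown: True if the outgoing track ends on a breakdown.
    """
    candidates = list(ranked)
    # Structural rule: a trailing breakdown promotes `breakdown_swap`.
    if ends_with_breakdown and "breakdown_swap" in candidates:
        candidates.remove("breakdown_swap")
        candidates.insert(0, "breakdown_swap")
    # Variety rule: skip whatever was used for the previous transition.
    for name in candidates:
        if name != previous:
            return name
    raise ValueError("no candidate differs from the previous transition")
```

Because the variety check runs last, it also overrides the breakdown preference when `breakdown_swap` was the previous technique.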
-
-### DSP Engine
-- **3-band EQ**: Pioneer DJM-style crossover (Low < 250Hz, Mid 250-2500Hz, High > 2500Hz)
-- **Automation curves**: S-curves, exponential ease-in/out, equal-power, stepped, hold-then-drop (not just linear fades)
-- **DJM-style dub echo**: tempo-synced delay with HP on feedback path
-- **Long reverb**: golden-ratio tap spacing with LP damping
-- **Beat stutter**: grain slicing at accelerating subdivisions
-- **Vinyl spinback**: pitch interpolation simulation
-- **Noise riser**: synthesized white noise with rising HP filter
-- **HTDemucs stem separation**: 9.20 dB SDR for surgical bass swaps and vocal isolation
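
To make the "automation curves" item concrete, here is a minimal sketch of two of the named shapes. The S-curve is written as a smoothstep polynomial and the equal-power fade as a quarter-cycle cosine/sine pair. Both are common textbook forms chosen for illustration, not necessarily the formulas the Space uses, and the function names are hypothetical.

```python
import math


def _clamp01(t):
    return min(1.0, max(0.0, t))


def s_curve(t):
    """Smoothstep S-curve: 0 at t=0, 1 at t=1, with zero slope at both ends."""
    t = _clamp01(t)
    return t * t * (3.0 - 2.0 * t)


def equal_power(t):
    """Return (outgoing_gain, incoming_gain) for transition progress t in 0..1.

    The squared gains always sum to 1, so two uncorrelated signals keep
    roughly constant combined power through the fade.
    """
    t = _clamp01(t)
    return math.cos(t * math.pi / 2), math.sin(t * math.pi / 2)
```

Feeding `s_curve(t)` into `equal_power` instead of `t` gives a fade that starts and ends gently while still holding combined power constant.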
-
-## Benchmarks
-
-| Metric | Target | What It Measures |
-|--------|--------|-----------------|
-| BPM Detection Stability | ≥0.85 | Inter-beat-interval consistency |
-| Key Detection Confidence | ≥0.65 | Krumhansl-Schmuckler correlation |
-| Structural Segmentation | ≥3 segments/track | Novelty-based boundary detection |
-| Cue Point Detection | ≥2 per track | Segment + phrase boundary coverage |
-| Transition Smoothness | ≤2.0 flux ratio | Spectral flux (transition vs steady-state) |
-| Transition Variety | 0 consecutive repeats | Context-aware selection diversity |
-
-## Research Foundation
-
-- **DJtransGAN** (2021): 4-band EQ + fader automation learned from 7,064 real DJ transitions ([arxiv:2110.06525](https://arxiv.org/abs/2110.06525))
-- **Real DJ Statistics** (2020): 94.5% within ±5% BPM, transitions peak at 32-beat phrases ([arxiv:2008.10267](https://arxiv.org/abs/2008.10267))
-- **CUE-DETR** (2024): DJ cue point detection, F1=0.46 at 16-bar phrasing ([arxiv:2407.06823](https://arxiv.org/abs/2407.06823))
-- **All-In-One** (2023): Beat F1=0.958, Segment HR.5F=0.660 ([arxiv:2307.16425](https://arxiv.org/abs/2307.16425))
-- **HTDemucs** (2023): 9.20 dB SDR stem separation ([arxiv:2211.08553](https://arxiv.org/abs/2211.08553))
-- **SongFormer** (2024): Segment boundaries HR.5F=0.703 ([arxiv:2510.02797](https://arxiv.org/abs/2510.02797))
-- **Foote (2000)**: Checkerboard kernel novelty for structural segmentation
-- **Pioneer DJM-900NXS2**: 3-band EQ crossover points (250Hz / 2500Hz), beat FX parameters

 colorTo: purple
 sdk: gradio
 sdk_version: 6.13.0
+app_file: main.py
 pinned: true
 license: mit
 suggested_hardware: zero-a10g

 - audio
 - mixing
 - stem-separation
+- demucs
 short_description: AI analyzes songs, plans transitions, renders DJ sets
 ---

 # AI DJ Set Builder

+Uses **demucs** stem separation for surgical drum/bass transitions with zero kick clash.