Open-Unmix HQ (MLX)

MLX-converted Open-Unmix HQ for music source separation on Apple Silicon.

Separates stereo music into four stems: vocals, drums, bass, and other. Runs at about 4.3x real-time on an M2 Max.

Model Details

Architecture: BiLSTM + FC encoder/decoder with magnitude masking
Parameters: 8.9M per stem (4 stems = 35.6M total)
Hidden size: 512
Input: Stereo 44.1kHz audio
Output: 4 stereo WAV stems
Format: safetensors (MLX-compatible)
Size: ~34 MB per stem, ~136 MB total
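
The "magnitude masking" in the architecture line can be illustrated with a minimal sketch: each per-stem network predicts a non-negative gain for every time-frequency bin, and the stem's magnitude spectrogram is the mixture magnitude scaled by that gain. The helper name `applyMagnitudeMask` and the flat `[Float]` layout are hypothetical, chosen for illustration; they are not part of speech-swift or Open-Unmix, and the clamp to non-negative values is an assumption about how the mask is conditioned.

```swift
// Hypothetical helper for illustration only; not a speech-swift or Open-Unmix API.
// `mixture` holds magnitude-spectrogram bins of the input mix,
// `mask` holds the network's per-bin gain for one stem.
func applyMagnitudeMask(mixture: [Float], mask: [Float]) -> [Float] {
    precondition(mixture.count == mask.count, "mask and mixture must have the same number of bins")
    // Assumption: negative gains are clamped to zero before scaling each bin.
    return zip(mixture, mask).map { $0 * max($1, 0) }
}

let mixtureMagnitude: [Float] = [1.0, 0.5, 2.0, 0.0]
let vocalsMask: [Float] = [0.5, 1.0, 0.0, 0.25]
let vocalsMagnitude = applyMagnitudeMask(mixture: mixtureMagnitude, mask: vocalsMask)
print(vocalsMagnitude)
```

Because the mask only rescales magnitudes, a bin that is silent in the mixture stays silent in every stem.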

Benchmark (MUSDB18-HQ, 50 tracks, M2 Max)

Target	SDR (dB)
Vocals	6.23
Drums	6.44
Bass	4.56
Other	3.41

Real-time factor (RTF): 0.23, i.e. about 4.3x faster than real time.
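
As a rough guide to what the RTF figure means in practice, the sketch below assumes the usual definition (processing time divided by audio duration) and estimates the separation time for a track. The three-minute track length is a hypothetical example, not a measured result.

```swift
// RTF as reported above; assumed to mean processing time / audio duration.
let realTimeFactor = 0.23

// Speed relative to real time is the reciprocal of the RTF.
let speedup = 1.0 / realTimeFactor

// Hypothetical three-minute track, used only to illustrate the arithmetic.
let trackSeconds = 180.0
let estimatedProcessingSeconds = trackSeconds * realTimeFactor

print("speedup:", (speedup * 10).rounded() / 10)
print("estimated seconds:", (estimatedProcessingSeconds * 10).rounded() / 10)
```

Actual timings will vary with hardware, track length, and system load.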

Usage

Used by speech-swift, either from the command line:

audio separate song.wav

or from Swift:

let separator = try await SourceSeparator.fromPretrained()
let stems = separator.separate(audio: stereoAudio, sampleRate: 44100)
// stems[.vocals], stems[.drums], stems[.bass], stems[.other]

Files

vocals.safetensors — Vocals model (34 MB)
drums.safetensors — Drums model (34 MB)
bass.safetensors — Bass model (34 MB)
other.safetensors — Other/accompaniment model (34 MB)
config.json — Model configuration
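
The per-stem file size is consistent with the parameter count under one assumption: that the weights are stored as 32-bit floats. The card does not state the dtype, so the sketch below is a sanity check under that assumption, using the rounded 8.9M figure.

```swift
// Rounded parameter count per stem, as listed under Model Details.
let parametersPerStem = 8_900_000.0

// Assumption: float32 weights (4 bytes each); the dtype is not stated in this card.
let bytesPerParameter = 4.0

let bytesPerStem = parametersPerStem * bytesPerParameter
let mebibytesPerStem = bytesPerStem / (1024.0 * 1024.0)
let totalMebibytes = mebibytesPerStem * 4.0  // four stems

print("per stem (MiB):", (mebibytesPerStem * 10).rounded() / 10)
print("total (MiB):", (totalMebibytes * 10).rounded() / 10)
```

Under this assumption the result comes out close to the ~34 MB per stem and ~136 MB total listed above.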

Reference

Open-Unmix (GitHub)
Stöter et al., "Open-Unmix — A Reference Implementation for Music Source Separation" (JOSS, 2019)

License

MIT (same as original Open-Unmix)
