Cocktail-Fork-MRX (MLX)

Apple MLX port of MERL's MRX (Multi-Resolution CrossNet), which separates a soundtrack mixture into three stems: music, speech, and sound effects (sfx). Runs natively on Apple Silicon, with no PyTorch dependency at inference.

  • Upstream: merlresearch/cocktail-fork-separation, "The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks" (ICASSP 2022).
  • Checkpoint: default_ (SNR-loss trained; the upstream default inference weights).
  • Variants: -paper (SI-SNR, ICASSP reproduction) · -adapted-loudness · -adapted-eq (cinematic-tuned for real movie stems).
  • Collection: Cocktail-Fork MRX (MLX).
  • License: MIT.
  • Parity: matches the PyTorch reference to within fp32 rounding error (full-pipeline max_abs ≈ 9e-8; per-stem SI-SDR of 107–139 dB against the torch output).
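The two parity figures above can be reproduced in spirit with a few lines of NumPy. This is a minimal sketch, not the port's actual test harness: `max_abs_diff` and `si_sdr_db` are hypothetical helper names, the SI-SDR formula is one common definition (no mean removal), and the two arrays are synthetic stand-ins for a PyTorch stem and an MLX stem.

```python
import numpy as np


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two arrays."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.max(np.abs(a - b)))


def si_sdr_db(reference, estimate, eps=1e-12):
    """Scale-invariant SDR in dB of `estimate` measured against `reference`."""
    ref = np.asarray(reference, dtype=np.float64).ravel()
    est = np.asarray(estimate, dtype=np.float64).ravel()
    # Project the estimate onto the reference to find the best scaling.
    alpha = np.dot(est, ref) / (np.dot(ref, ref) + eps)
    target = alpha * ref
    residual = est - target
    ratio = (np.dot(target, target) + eps) / (np.dot(residual, residual) + eps)
    return float(10.0 * np.log10(ratio))


# Synthetic stand-ins (assumption): one second of noise at 44.1 kHz and a
# copy perturbed at roughly fp32 rounding level.
rng = np.random.default_rng(0)
torch_stem = rng.standard_normal(44100).astype(np.float32)
mlx_stem = torch_stem + 1e-7 * rng.standard_normal(44100).astype(np.float32)

print("max_abs:", max_abs_diff(torch_stem, mlx_stem))
print("SI-SDR (dB):", si_sdr_db(torch_stem, mlx_stem))
```

An SI-SDR above 100 dB means the residual energy is more than ten orders of magnitude below the signal energy, which is the regime where differences come from floating-point rounding rather than from the model.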

Usage

From the command line:

pip install cocktail-fork-mlx   # or: pip install git+https://github.com/xocialize/cocktail-fork-mlx
cocktail-fork-mlx --audio-path soundtrack.wav --out-dir ./out
# -> out/music.wav  out/speech.wav  out/sfx.wav

From Python:

import mlx.core as mx
import numpy as np
import soundfile as sf
from cocktail_fork_mlx.separate import separate_soundtrack
from cocktail_fork_mlx.weights import from_pretrained

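# Read as (samples, channels); the model expects 44.1 kHz input.
audio, fs = sf.read("soundtrack.wav", always_2d=True)
model = from_pretrained("mlx-community/Cocktail-Fork-MRX")
# Transpose to (channels, samples) and cast to float32 before separation.
stems = separate_soundtrack(mx.array(audio.T.astype("float32")), model)
# Write each stem back out as (samples, channels).
for name, x in stems.items():
    sf.write(f"{name}.wav", np.array(x).T, 44100)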

Model

  • 44.1 kHz, any channel count. ~30.6M params, fp32 (122 MB).
  • Multi-resolution STFT (windows 1024/2048/8192, hop 256) → per-resolution magnitude encoders → 3 parallel bidirectional CrossNet LSTMs → per-source/per-resolution mask decoders → masked iSTFT summed across resolutions.
  • CPU is the faster device for this LSTM-bound model (default in the CLI).
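The multi-resolution front end described above can be illustrated with plain NumPy. This is a rough sketch of the idea only, not the port's implementation: the Hann window, the centered reflect padding, and the helper name `stft_magnitude` are assumptions. It shows why a shared hop of 256 is convenient: with centered frames, every resolution yields the same number of time frames, so the per-resolution features line up frame by frame.

```python
import numpy as np

WINDOWS = (1024, 2048, 8192)  # window sizes from the model description
HOP = 256                     # hop shared by all resolutions


def stft_magnitude(x, win, hop=HOP):
    """Magnitude STFT of a mono signal, shape (frames, win // 2 + 1)."""
    pad = win // 2
    # Centered frames via reflect padding (assumption, not confirmed upstream).
    xp = np.pad(x, (pad, pad), mode="reflect")
    n_frames = 1 + (len(xp) - win) // hop
    # Index matrix: row i selects samples [i * hop, i * hop + win).
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = xp[idx] * np.hanning(win)
    return np.abs(np.fft.rfft(frames, axis=-1))


# One second of synthetic mono audio at 44.1 kHz.
rng = np.random.default_rng(0)
signal = rng.standard_normal(44100)

mags = {win: stft_magnitude(signal, win) for win in WINDOWS}
for win, m in mags.items():
    print(win, m.shape)
# 1024 (173, 513)
# 2048 (173, 1025)
# 8192 (173, 4097)
```

The longer windows give finer frequency resolution (more bins per frame) and the shorter ones finer time resolution, while the frame axis stays identical across all three.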

Ported by MVS Collective (xocialize). MIT, © MERL for the original model/weights.
