Cocktail-Fork-MRX — `paper` variant (MLX)

Apple MLX port of MERL's MRX (Multi-Resolution CrossNet) — separates a soundtrack mixture into music, speech, and sound effects (sfx).

This variant uses the `paper_` checkpoint, which was trained with the scale-invariant SNR loss and reproduces the results from the ICASSP 2022 paper. Use it when you need benchmark comparability with the original publication.
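For readers unfamiliar with the distinction between the two losses, the sketch below contrasts plain SNR with scale-invariant SNR using the standard textbook definitions. It is an illustration only, not code from the MERL or MLX repositories; the function names `snr_db` and `si_snr_db` and the synthetic signals are hypothetical. It shows that halving the gain of an estimate lowers its SNR but leaves its SI-SNR essentially unchanged, which is why a model trained with a scale-invariant loss may produce stems whose level does not match the mixture.

```python
import math
import random


def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)


def snr_db(estimate, target, eps=1e-8):
    """Plain SNR in dB: any deviation from the target, including a gain error, counts as noise."""
    noise = [e - t for e, t in zip(estimate, target)]
    return 10.0 * math.log10((energy(target) + eps) / (energy(noise) + eps))


def si_snr_db(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB: the target is first rescaled by the best-fitting gain."""
    # Project the estimate onto the target to find the optimal scaling factor.
    alpha = sum(e * t for e, t in zip(estimate, target)) / (energy(target) + eps)
    scaled = [alpha * t for t in target]
    noise = [e - s for e, s in zip(estimate, scaled)]
    return 10.0 * math.log10((energy(scaled) + eps) / (energy(noise) + eps))


random.seed(0)
# Synthetic stand-ins for audio: 0.1 s at 44.1 kHz.
target = [random.gauss(0.0, 1.0) for _ in range(4410)]
estimate = [t + 0.1 * random.gauss(0.0, 1.0) for t in target]
quiet = [0.5 * e for e in estimate]  # same estimate at half the amplitude

print("SNR    full / half gain:", round(snr_db(estimate, target), 2), round(snr_db(quiet, target), 2))
print("SI-SNR full / half gain:", round(si_snr_db(estimate, target), 2), round(si_snr_db(quiet, target), 2))
```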

Other variants: Cocktail-Fork-MRX (default, SNR loss) · -adapted-loudness · -adapted-eq (cinematic-tuned).

Upstream: merlresearch/cocktail-fork-separation (MIT) · The Cocktail Fork Problem, ICASSP 2022.
Parity: matches the PyTorch reference to fp32 precision (full-forward max_abs ≈ 2e-7).
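The max_abs figure is presumably the largest element-wise absolute difference between the outputs of the two frameworks for the same input. A minimal sketch of that metric follows; the helper name `max_abs_diff` is hypothetical, and the short lists are made-up stand-ins rather than real model outputs.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


# Made-up stand-ins for a few output samples from each framework.
reference = [0.25, -0.5, 0.125]
ported = [0.25 + 1e-7, -0.5, 0.125 - 2e-7]

diff = max_abs_diff(reference, ported)
print("max_abs:", diff)
print("within 1e-6 tolerance:", diff < 1e-6)
```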

Usage

pip install git+https://github.com/xocialize/cocktail-fork-mlx
cocktail-fork-mlx --audio-path soundtrack.wav --out-dir ./out \
    --weights mlx-community/Cocktail-Fork-MRX-paper
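To separate several files in one go, the command above can be wrapped in a short Python script. This is a sketch that relies only on the three flags shown above and assumes `cocktail-fork-mlx` is on the PATH after installation; the helper names `build_command` and `separate_all`, the `soundtracks` input folder, and the one-output-folder-per-input layout are hypothetical choices.

```python
import subprocess
from pathlib import Path

WEIGHTS = "mlx-community/Cocktail-Fork-MRX-paper"


def build_command(audio_path, out_dir, weights=WEIGHTS):
    """Build the CLI invocation using only the flags documented in this card."""
    return [
        "cocktail-fork-mlx",
        "--audio-path", str(audio_path),
        "--out-dir", str(out_dir),
        "--weights", weights,
    ]


def separate_all(input_dir, output_root):
    """Run the separator once per .wav file, writing each result to its own folder."""
    for wav in sorted(Path(input_dir).glob("*.wav")):
        out_dir = Path(output_root) / wav.stem  # hypothetical layout: out/<file name>/
        out_dir.mkdir(parents=True, exist_ok=True)
        # check=True raises CalledProcessError if the CLI exits with a non-zero status.
        subprocess.run(build_command(wav, out_dir), check=True)


# Example call (assumes a local folder named "soundtracks" containing .wav files):
# separate_all("soundtracks", "out")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.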

~30.6M params, fp32 (122 MB), 44.1 kHz. MIT, © MERL for the original model/weights.
