RE-USE SEMamba Speech Enhancement (MLX)
Pure-MLX conversion of NVIDIA RE-USE, a
~9.6M-parameter SEMamba universal speech-enhancement model. In
mlx-speech it cleans a voice
reference before VAE conditioning when DramaBox TTS
runs with denoise_ref=True, giving the cloning model a clean speaker anchor.
Non-commercial weights. These weights derive from NVIDIA RE-USE, licensed under the NVIDIA Source Code License (non-commercial). See the License section.
Model Details
- Developed by: App Automaton
- Upstream model: nvidia/RE-USE (SEMamba, bidirectional Mamba over STFT magnitude + phase)
- Role: input-side voice-reference denoiser for DramaBox denoise_ref=True. Optional, off by default.
- Conversion: format-only port of the fp32 weights to MLX .safetensors (1416 keys, ~9.6M params). No quantization, no architecture change.
- Runtime: pure MLX on Apple Silicon. The selective scan mirrors the mamba_ssm selective_scan_ref reference math, so no CUDA kernels (mamba-ssm / causal-conv1d) are required.
- Parity: the MLX port matches the torch reference at amplitude-weighted complex correlation 0.9998 (model) and 0.9997 (end-to-end waveform on real speech).
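A reference-style selective scan is a plain sequential recurrence, which is why it can run without a custom CUDA kernel. The sketch below is a minimal single-channel version in pure Python, assuming the standard Mamba discretization h_t = exp(delta_t * A) * h_{t-1} + delta_t * B_t * x_t and y_t = C_t * h_t + D * x_t. The function name and argument layout are illustrative assumptions and are not taken from mlx-speech or mamba_ssm.

```python
import math

def selective_scan_1d(x, delta, A, B, C, D=0.0):
    """Sequential selective scan for a single channel (illustrative sketch).

    x, delta: length-L sequences (input samples and positive step sizes).
    A:        length-N state coefficients (typically negative, so states decay).
    B, C:     length-L sequences of length-N vectors (input-dependent projections).
    D:        scalar weight of the skip connection.
    """
    n = len(A)
    h = [0.0] * n          # hidden state, carried across time steps
    y = []
    for t in range(len(x)):
        for i in range(n):
            decay = math.exp(delta[t] * A[i])            # discretized state transition
            h[i] = decay * h[i] + delta[t] * B[t][i] * x[t]
        # Read out the state and add the skip path.
        y.append(sum(C[t][i] * h[i] for i in range(n)) + D * x[t])
    return y
```

With A = 0 the state never decays and the scan reduces to a running sum of the inputs, which is a convenient sanity check for any port of this loop.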
Contents
| File | Component | Format | Size |
|---|---|---|---|
| model.safetensors | SEMamba enhancer | fp32 | ~38 MB |
| config.json | Model + STFT config | JSON | n/a |
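The listed file size is consistent with the parameter count, since fp32 stores each parameter in 4 bytes. A quick check, using the rounded ~9.6M figure and ignoring the small header and metadata overhead of the .safetensors container:

```python
PARAMS = 9.6e6         # approximate parameter count quoted in this card
BYTES_PER_FP32 = 4     # one 32-bit float per parameter

size_mb = PARAMS * BYTES_PER_FP32 / 1e6   # decimal megabytes
print(f"{size_mb:.1f} MB")                # → 38.4 MB
```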
How to Get Started
Used automatically by DramaBox when you opt in:
    import mlx_speech

    tts = mlx_speech.tts.load("dramabox")
    result = tts.generate(
        "Voice cloning from a noisy reference.",
        reference_audio="noisy_speaker.wav",
        denoise_ref=True,  # cleans the reference with this model first
    )
tts.load("dramabox") resolves these weights automatically. To run the enhancer
directly:
    hf download appautomaton/re-use-semamba-mlx --local-dir models/reuse/mlx

Then load the enhancer from that directory:

    from pathlib import Path
    from mlx_speech.generation.reuse import REUSEEnhancer

    enhancer = REUSEEnhancer.from_dir(Path("models/reuse/mlx"))
    # noisy_waveform: your mono input audio, sampled at in_sr Hz (loading it is not shown here)
    clean = enhancer.enhance(noisy_waveform, in_sr=16000)  # mono in, mono out
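The enhancer call above takes a mono waveform and its sample rate. One way to prepare that input from a 16-bit PCM WAV file, using only the Python standard library, is sketched below. The helper name load_wav_mono is hypothetical, and the container type that REUSEEnhancer.enhance accepts is not stated in this card, so convert the returned list to whatever array type the mlx-speech API requires.

```python
import struct
import wave

def load_wav_mono(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate).

    Samples are floats in [-1.0, 1.0). Multi-channel audio is downmixed
    to mono by averaging the channels of each frame.
    """
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM WAV is handled in this sketch")
        channels = wf.getnchannels()
        sample_rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())

    count = len(raw) // 2                       # number of 16-bit values
    ints = struct.unpack(f"<{count}h", raw)     # little-endian signed shorts
    mono = [
        sum(ints[i:i + channels]) / (channels * 32768.0)
        for i in range(0, count - channels + 1, channels)
    ]
    return mono, sample_rate
```

The returned sample rate can be passed as in_sr, so the value given to the enhancer always matches the file rather than a hard-coded 16000.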
Intended Use
Denoising a short voice-reference clip before voice cloning, so the model conditions on a clean speaker/style anchor rather than the recording's noise. The enhancer runs on the reference input, never on generated output, so the TTS model's paralinguistic events (breaths, laughs) are preserved.
Links
- Source code: appautomaton/mlx-speech
- Paired model: appautomaton/dramabox-tts-3.3b-bf16-mlx
- More from App Automaton: GitHub · Hugging Face
License
NVIDIA Source Code License (non-commercial). These weights are a format
conversion of nvidia/RE-USE and remain
governed by NVIDIA's license terms; by downloading or using them you agree to
those terms. They may not be used commercially. Set denoise_ref=False (the
default) to run DramaBox voice cloning without this model. The mlx-speech
runtime code is MIT.