---
license: other
license_name: nvidia-one-way-noncommercial-license-nsclv1
license_link: https://github.com/NVlabs/HMAR/blob/main/LICENSE
track_downloads: true
pipeline_tag: audio-to-audio
tasks:
- audio-to-audio
base_model: nvidia/RE-USE
library_name: mlx
tags:
- speech-enhancement
- universal speech enhancement
- multiple input sampling rates
- language-agnostic
- audio-to-audio
- mamba
- re-use
- mlx
- mlx-swift
- safetensors
- apple
---

# RE-USE MLX Converted Weights

This repository contains MLX-compatible converted weights for [`nvidia/RE-USE`](https://huggingface.co/nvidia/RE-USE), NVIDIA's multilingual universal speech enhancement model.

These artifacts are intended for Apple Silicon / MLX / MLX Swift inference workflows.

## Files

| File | Description |
|---|---|
| `model_mlx.safetensors` | Converted MLX-compatible runtime weights. |
| `conversion-manifest.json` | Source revision, checksum, size, license, and conversion metadata. |
| `LICENSE` | Copy of the NVIDIA One-Way Noncommercial License (NSCLv1). |
| `NOTICE` | Attribution and repository relationship notice. |

## Source Model

These weights are derived from:

- Model: [`nvidia/RE-USE`](https://huggingface.co/nvidia/RE-USE)
- Source revision: `761905064ea1ea882e015e20a64e2e9d28458890`
- Source file: `model.safetensors`
- Upstream model version: `30USEMamba_peak+GAN_tel_mic_1134k`

This repository contains a converted derivative artifact only. It does not contain the original NVIDIA checkpoint.

## Conversion

The conversion changes tensor keys and layouts for MLX Swift compatibility. It does not train, fine-tune, distill, quantize, or otherwise modify the learned model behavior.

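As a rough illustration of what a key-and-layout conversion of this kind involves, the following sketch renames parameters and reorders a 1D convolution weight from the PyTorch layout `(out_channels, in_channels, kernel)` to the MLX layout `(out_channels, kernel, in_channels)`. It is not the script used to produce `model_mlx.safetensors`: the parameter names, the rename table, and the choice of a 1D convolution are assumptions made for illustration, and tensors are plain nested lists so the example runs with the standard library alone.

```python
# Hypothetical rename table: these names are illustrative only and are
# not the actual RE-USE parameter names.
KEY_RENAMES = {
    "enc.conv_1.weight": "encoder.conv1.weight",
    "enc.conv_1.bias": "encoder.conv1.bias",
}


def conv1d_torch_to_mlx(weight):
    """Reorder a nested-list Conv1d weight from PyTorch layout
    (out_channels, in_channels, kernel) to MLX layout
    (out_channels, kernel, in_channels)."""
    return [[list(tap) for tap in zip(*out_channel)] for out_channel in weight]


def convert(state, conv_weight_keys):
    """Rename every key; transpose only the weights listed in conv_weight_keys."""
    converted = {}
    for name, tensor in state.items():
        if name in conv_weight_keys:
            tensor = conv1d_torch_to_mlx(tensor)
        # Keys without an entry in the table keep their original name.
        converted[KEY_RENAMES.get(name, name)] = tensor
    return converted


# Toy state dict: one conv weight of shape (2, 3, 2) and its bias.
toy_state = {
    "enc.conv_1.weight": [
        [[1, 2], [3, 4], [5, 6]],
        [[7, 8], [9, 10], [11, 12]],
    ],
    "enc.conv_1.bias": [0.0, 0.0],
}

converted = convert(toy_state, conv_weight_keys={"enc.conv_1.weight"})
print(converted["encoder.conv1.weight"])  # shape is now (2, 2, 3)
```

Only the axis order changes; every value in the weight is preserved, which is why a conversion of this kind leaves the learned behavior untouched.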
The conversion consists of:

- mapping source PyTorch-style parameter names to MLX Swift parameter names
- converting convolution tensor layouts where required by the MLX implementation
- preserving required tensor dtypes for runtime correctness
- writing the result as `model_mlx.safetensors`

See `conversion-manifest.json` for checksum and source metadata.

## License / Terms of Use

These converted weights are distributed under the **NVIDIA One-Way Noncommercial License (NSCLv1)**.

Use is limited to **non-commercial research and educational purposes only**.

The complete license text is included in this repository as `LICENSE` and is also available from NVIDIA at:

Any source code used with these weights may be licensed separately. That source-code license does not change the license terms of these converted model weights.

## Attribution

Original model and checkpoint are by NVIDIA and the upstream RE-USE contributors.

This repository is not affiliated with, sponsored by, or endorsed by NVIDIA. NVIDIA trademarks are referenced only for attribution and license notice purposes.

## Use Case

Researchers and general users can use these MLX-compatible converted weights to enhance the quality of their speech data.

These weights are derived from `nvidia/RE-USE` and remain subject to the NVIDIA One-Way Noncommercial License (NSCLv1).

## Model Architecture

- **Architecture Type:** Convolutional encoder, Convolutional decoder, and Mamba for time–frequency modeling
- **Network Architecture:** Bi-directional Mamba with 30 layers
- **Number of model parameters:** 9.6M

## Input

- Input Type(s): Audio
- Input Format(s): Typically `.wav` files in compatible implementations
- Input Parameters: One-Dimensional (1D)
- Other Properties Related to Input: 8000 Hz - 48000 Hz Mono-channel Audio

## Output

- Output Type(s): Audio
- Output Format: Typically `.wav` files in compatible implementations
- Output Parameters: One-Dimensional (1D)
- Other Properties Related to Output: 8000 Hz - 48000 Hz Mono-channel Audio

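The input and output properties above (mono audio sampled between 8000 Hz and 48000 Hz) can be checked before a file is handed to an inference pipeline. The sketch below uses only Python's standard `wave` module; the function names are hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
import wave

# Documented range for this model card; inclusive bounds are an assumption.
MIN_RATE_HZ = 8000
MAX_RATE_HZ = 48000


def check_audio_properties(sample_rate_hz, channels):
    """Return a list of problems; an empty list means the properties fit."""
    problems = []
    if not MIN_RATE_HZ <= sample_rate_hz <= MAX_RATE_HZ:
        problems.append(
            f"sample rate {sample_rate_hz} Hz is outside "
            f"{MIN_RATE_HZ}-{MAX_RATE_HZ} Hz"
        )
    if channels != 1:
        problems.append(f"expected mono audio, found {channels} channels")
    return problems


def check_wav_file(path):
    """Read the header of a PCM .wav file and check it against the range."""
    with wave.open(path, "rb") as wav:
        return check_audio_properties(wav.getframerate(), wav.getnchannels())
```

For example, `check_wav_file("noisy.wav")` (a hypothetical path) returns an empty list for a 16 kHz mono recording and one message per violated property otherwise. Files that fail the check would need resampling or downmixing first.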
## Software Integration

These converted weights are intended for MLX-compatible implementations, especially MLX Swift on Apple Silicon.

Upstream NVIDIA RE-USE was originally documented for NVIDIA GPU-accelerated systems. See the original model card for upstream environment, hardware, and inference instructions:

## Training Data

This repository does not modify or retrain the model. Training dataset details are documented by NVIDIA in the upstream RE-USE model card:

## Checksum

`model_mlx.safetensors`

```text
SHA-256: d1158502eaf39d0b11d097177160ce3804454653c5d14d17921b6c274ca53237
Size: 38583628 bytes
```

## Ethical Considerations

Speech enhancement can alter evidence-like audio. Enhanced outputs should not be represented as untouched originals. Users should evaluate model behavior on their own data and use case before deployment.

For upstream NVIDIA model quality, safety, or security concerns, see the original RE-USE model card.

## Citation

Please cite the original NVIDIA RE-USE model and paper when using these converted weights.

Original model repository:

```bibtex
@article{fu2026rethinking,
  title={Rethinking Training Targets, Architectures and Data Quality for Universal Speech Enhancement},
  author={Fu, Szu-Wei and Chao, Rong and Yang, Xuesong and Huang, Sung-Feng and Zezario, Ryandhimas E and Nasretdinov, Rauf and Juki{\'c}, Ante and Tsao, Yu and Wang, Yu-Chiang Frank},
  journal={arXiv preprint arXiv:2603.02641},
  year={2026}
}
```

If you use this MLX conversion artifact directly, please also cite this repository:

```bibtex
@misc{cagataycalli2026reusemlx,
  title = {RE-USE MLX Converted Weights},
  author = {Çağatay Çallı},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/faraday/re-use-mlx}},
  note = {MLX-compatible converted weights derived from NVIDIA RE-USE}
}
```