	---
	license: other
	license_name: nvidia-one-way-noncommercial-license-nsclv1
	license_link: https://github.com/NVlabs/HMAR/blob/main/LICENSE
	track_downloads: true
	pipeline_tag: audio-to-audio
	tasks:
	- audio-to-audio
	base_model: nvidia/RE-USE
	library_name: mlx
	tags:
	- speech-enhancement
	- universal speech enhancement
	- multiple input sampling rates
	- language-agnostic
	- audio-to-audio
	- mamba
	- re-use
	- mlx
	- mlx-swift
	- safetensors
	- apple
	---

	# RE-USE MLX Converted Weights

	This repository contains MLX-compatible converted weights for [`nvidia/RE-USE`](https://huggingface.co/nvidia/RE-USE), NVIDIA's multilingual universal speech enhancement model.

	These artifacts are intended for Apple Silicon / MLX / MLX Swift inference workflows.

	## Files

	| File | Description |
	|---|---|
	| `model_mlx.safetensors` | Converted MLX-compatible runtime weights. |
	| `conversion-manifest.json` | Source revision, checksum, size, license, and conversion metadata. |
	| `LICENSE` | Copy of the NVIDIA One-Way Noncommercial License (NSCLv1). |
	| `NOTICE` | Attribution and repository relationship notice. |
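
To see which tensors `model_mlx.safetensors` holds without loading any weights, the file header can be read directly. The sketch below relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header); the function names are illustrative and not part of this repository.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # The next header_len bytes are a UTF-8 JSON object describing every tensor.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_tensors(header):
    """Map tensor name -> (dtype, shape), skipping the optional metadata entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }
```

Calling `summarize_tensors(read_safetensors_header("model_mlx.safetensors"))` should list each converted parameter name with its stored dtype and shape, which is a quick way to check the key naming an MLX implementation expects.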

	## Source Model

	These weights are derived from:

	- Model: [`nvidia/RE-USE`](https://huggingface.co/nvidia/RE-USE)
	- Source revision: `761905064ea1ea882e015e20a64e2e9d28458890`
	- Source file: `model.safetensors`
	- Upstream model version: `30USEMamba_peak+GAN_tel_mic_1134k`

	This repository contains a converted derivative artifact only. It does not contain the original NVIDIA checkpoint.

	## Conversion

	The conversion changes tensor keys and layouts for MLX Swift compatibility. It does not train, fine-tune, distill, quantize, or otherwise modify the learned model behavior.

	The conversion consists of:

	- mapping source PyTorch-style parameter names to MLX Swift parameter names
	- converting convolution tensor layouts where required by the MLX implementation
	- preserving required tensor dtypes for runtime correctness
	- writing the result as `model_mlx.safetensors`
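
The steps above can be illustrated with a minimal NumPy sketch. It is not the script used to produce these weights: the rename rules and the "name contains `conv`" heuristic are hypothetical placeholders, and transposed convolutions (which use a different source layout in PyTorch) are deliberately left out. The only fixed facts it encodes are the layouts themselves: PyTorch stores convolution weights channels-first, while MLX expects the input-channel axis last.

```python
import numpy as np

# Hypothetical rename rules (source substring -> target substring).
# The real key mapping is not listed in this model card.
RENAME_RULES = [("dense_encoder.", "denseEncoder.")]


def rename_key(key):
    """Apply each rename rule in order to a source parameter name."""
    for old, new in RENAME_RULES:
        key = key.replace(old, new)
    return key


def convert_tensor(name, tensor):
    """Move the input-channel axis last for conv weights; return other tensors unchanged."""
    # Assumption: conv weights can be recognised by name. A real converter
    # would more likely use an explicit list of keys.
    if "conv" in name and name.endswith(".weight"):
        if tensor.ndim == 4:  # Conv2d: (out, in, kH, kW) -> (out, kH, kW, in)
            return np.transpose(tensor, (0, 2, 3, 1))
        if tensor.ndim == 3:  # Conv1d: (out, in, k) -> (out, k, in)
            return np.transpose(tensor, (0, 2, 1))
    return tensor


def convert_state_dict(state):
    """Rename keys and fix layouts; transposition keeps each tensor's dtype."""
    return {
        rename_key(name): np.ascontiguousarray(convert_tensor(name, tensor))
        for name, tensor in state.items()
    }
```

Because the sketch only renames and transposes, every value in the source tensors survives unchanged, which matches the statement that the conversion does not alter learned behavior.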

	See `conversion-manifest.json` for checksum and source metadata.

	## License / Terms of Use

	These converted weights are distributed under the NVIDIA One-Way Noncommercial License (NSCLv1).

	Use is limited to non-commercial research and educational purposes only.

	The complete license text is included in this repository as `LICENSE` and is also available from NVIDIA at:

	<https://github.com/NVlabs/HMAR/blob/main/LICENSE>

	Any source code used with these weights may be licensed separately. That source-code license does not change the license terms of these converted model weights.

	## Attribution

	The original model and checkpoint are by NVIDIA and the upstream RE-USE contributors.

	This repository is not affiliated with, sponsored by, or endorsed by NVIDIA. NVIDIA trademarks are referenced only for attribution and license notice purposes.

	## Use Case

	Researchers and general users can apply these MLX-compatible converted weights to improve the quality of their speech recordings.

	These weights are derived from `nvidia/RE-USE` and remain subject to the NVIDIA One-Way Noncommercial License (NSCLv1).

	## Model Architecture

	Architecture Type: Convolutional encoder, Convolutional decoder, and Mamba for time–frequency modeling <br>
	Network Architecture: Bi-directional Mamba with 30 layers <br>
	Number of model parameters: 9.6M <br>

	## Input

	Input Type(s): Audio <br>
	Input Format(s): Typically `.wav` files in compatible implementations <br>
	Input Parameters: One-Dimensional (1D) <br>
	Other Properties Related to Input: 8000 Hz - 48000 Hz Mono-channel Audio <br>
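
A caller may want to check a file against these input properties before running inference. The following is a small sketch using Python's standard `wave` module, which reads uncompressed PCM `.wav` files only; the function name and return shape are illustrative choices, not part of any RE-USE implementation.

```python
import wave

# Range and channel count taken from the input properties listed above.
MIN_RATE_HZ = 8000
MAX_RATE_HZ = 48000


def check_wav_input(path):
    """Return (ok, reason) for a PCM .wav file against the stated input properties."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
    if channels != 1:
        return False, f"expected mono audio, got {channels} channels"
    if not MIN_RATE_HZ <= rate <= MAX_RATE_HZ:
        return False, f"sample rate {rate} Hz is outside {MIN_RATE_HZ}-{MAX_RATE_HZ} Hz"
    return True, "ok"
```

Files that fail the check would need to be downmixed or resampled first; how a given implementation handles that is outside the scope of this card.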

	## Output

	Output Type(s): Audio <br>
	Output Format(s): Typically `.wav` files in compatible implementations <br>
	Output Parameters: One-Dimensional (1D) <br>
	Other Properties Related to Output: 8000 Hz - 48000 Hz Mono-channel Audio <br>

	## Software Integration

	These converted weights are intended for MLX-compatible implementations, especially MLX Swift on Apple Silicon.

	Upstream NVIDIA RE-USE was originally documented for NVIDIA GPU-accelerated systems. See the original model card for upstream environment, hardware, and inference instructions:

	<https://huggingface.co/nvidia/RE-USE>

	## Training Data

	This repository does not modify or retrain the model.

	Training dataset details are documented by NVIDIA in the upstream RE-USE model card:

	<https://huggingface.co/nvidia/RE-USE>

	## Checksum

	`model_mlx.safetensors`

	```text
	SHA-256: d1158502eaf39d0b11d097177160ce3804454653c5d14d17921b6c274ca53237
	Size: 38583628 bytes
	```
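
One way to confirm a download matches the values above is to hash the file locally. This is a minimal sketch using only the Python standard library; the expected digest and size are copied from this section, while the function names and the local path are assumptions.

```python
import hashlib
import os

# Values copied from the Checksum section of this card.
EXPECTED_SHA256 = "d1158502eaf39d0b11d097177160ce3804454653c5d14d17921b6c274ca53237"
EXPECTED_SIZE = 38583628


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large weights are never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_sha256=EXPECTED_SHA256, expected_size=EXPECTED_SIZE):
    """Return True only if both the byte size and the SHA-256 digest match."""
    if os.path.getsize(path) != expected_size:
        return False
    return sha256_of_file(path) == expected_sha256
```

For example, `verify_file("model_mlx.safetensors")` should return `True` for an intact copy of the weights and `False` for a truncated or altered file.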

	## Ethical Considerations

	Speech enhancement alters the audio it processes, which matters when a recording may later serve as evidence. Enhanced outputs should not be represented as untouched originals.

	Users should evaluate model behavior on their own data and use case before deployment. For upstream NVIDIA model quality, safety, or security concerns, see the original RE-USE model card.

	## Citation

	Please cite the original NVIDIA RE-USE model and paper when using these converted weights.

	Original model repository:

	<https://huggingface.co/nvidia/RE-USE>

	```bibtex
	@article{fu2026rethinking,
	title={Rethinking Training Targets, Architectures and Data Quality for Universal Speech Enhancement},
	author={Fu, Szu-Wei and Chao, Rong and Yang, Xuesong and Huang, Sung-Feng and Zezario, Ryandhimas E and Nasretdinov, Rauf and Juki{\'c}, Ante and Tsao, Yu and Wang, Yu-Chiang Frank},
	journal={arXiv preprint arXiv:2603.02641},
	year={2026}
	}
	```

	If you use this MLX conversion artifact directly, please also cite this repository:

	```bibtex
	@misc{cagataycalli2026reusemlx,
	title = {RE-USE MLX Converted Weights},
	author = {Çağatay Çallı},
	year = {2026},
	publisher = {Hugging Face},
	howpublished = {\url{https://huggingface.co/faraday/re-use-mlx}},
	note = {MLX-compatible converted weights derived from NVIDIA RE-USE}
	}
	```