
	---
	license: mit
	language:
	- en
	tags:
	- onnx
	- audio
	- voice-isolation
	- speech
	- demucs
	- bsrnn
	pipeline_tag: audio-to-audio
	library_name: onnxruntime
	---

	# VoiceIsolate-Models

	Quantized ONNX models used by [VoiceIsolate Pro](https://github.com/Joker5514/VoiceIsolate-Pro) for on-device, GPU-accelerated voice isolation and audio enhancement.

	All inference runs entirely client-side in the browser via ONNX Runtime Web, using WebGPU where available and falling back to WASM otherwise. No inference server is required.
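
As a minimal sketch of the WebGPU / WASM fallback, a session could be created with the execution providers listed in priority order. This assumes `ort` is the module imported from `onnxruntime-web`; the helper names `pickExecutionProviders` and `createSession` are hypothetical and are not taken from VoiceIsolate Pro.

```javascript
// Hypothetical helper: list ONNX Runtime Web execution providers in priority order.
// In a browser, `hasWebGpu` would typically be `'gpu' in navigator`.
function pickExecutionProviders(hasWebGpu) {
  return hasWebGpu ? ['webgpu', 'wasm'] : ['wasm'];
}

// Hypothetical helper: `ort` is assumed to be the onnxruntime-web module,
// passed in so the sketch does not depend on how it is bundled.
// `modelBytes` is the model file as a Uint8Array or ArrayBuffer.
async function createSession(ort, modelBytes, hasWebGpu) {
  return ort.InferenceSession.create(modelBytes, {
    executionProviders: pickExecutionProviders(hasWebGpu),
  });
}
```

Passing the provider list explicitly keeps the fallback order visible in one place instead of relying on runtime defaults.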

	## Models in this Repository

	| File | Description | Size | Source |
	|---|---|---|---|
	| `demucs_v4_quantized.onnx` | Demucs v4 (HTDemucs), int8-quantized; stem-level voice isolation | ~83 MB | [facebookresearch/demucs](https://github.com/facebookresearch/demucs) |
	| `bsrnn_vocals.onnx` | BSRNN (Band-Split RNN) vocals separator | ~45 MB | [crlandsc/bsrnn](https://github.com/crlandsc/bsrnn) |

	## Usage

	VoiceIsolate Pro fetches these models automatically through `ml-worker-fetch-cache.js`. After the first download they are cached in IndexedDB and served from that cache on later loads, so they are not downloaded again.

	```js
	// MODEL_REGISTRY entries in ml-worker-fetch-cache.js (excerpt; the `const` declaration form is assumed)
	const MODEL_REGISTRY = {
	  demucs_v4: {
	    path: 'models/demucs_v4_quantized.onnx',
	    sizeBytes: 87_031_808, // 83 MiB
	    cdnUrls: ['https://huggingface.co/Joker5514/VoiceIsolate-Models/resolve/main/demucs_v4_quantized.onnx'],
	  },
	  bsrnn_vocals: {
	    path: 'models/bsrnn_vocals.onnx',
	    sizeBytes: 3_870_554, // about 3.7 MiB
	    cdnUrls: ['https://huggingface.co/Joker5514/VoiceIsolate-Models/resolve/main/bsrnn_vocals.onnx'],
	  },
	};
	```
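
The cache-then-fetch behaviour described in this section can be sketched roughly as follows. This is an illustration only, not the actual `ml-worker-fetch-cache.js` code: the function names, the `voiceisolate-models` database name, the `models` store name, and the use of `sizeBytes` as an exact-size check are all assumptions.

```javascript
// Hypothetical: open an IndexedDB database with one object store for model bytes.
// The database and store names are assumptions made for this sketch.
function openModelDb(idb = indexedDB) {
  return new Promise((resolve, reject) => {
    const req = idb.open('voiceisolate-models', 1);
    req.onupgradeneeded = () => req.result.createObjectStore('models');
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

// Hypothetical: wrap the database in a small promise-based get/put cache.
function idbCache(db) {
  const run = (mode, fn) => new Promise((resolve, reject) => {
    const req = fn(db.transaction('models', mode).objectStore('models'));
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
  return {
    get: (key) => run('readonly', (store) => store.get(key)),
    put: (key, value) => run('readwrite', (store) => store.put(value, key)),
  };
}

// Return model bytes from the cache; on a miss, try each CDN URL in order.
async function loadModel(entry, cache, fetchFn = fetch) {
  const cached = await cache.get(entry.path);
  if (cached) return cached;

  let lastError = new Error('no cdnUrls configured');
  for (const url of entry.cdnUrls) {
    try {
      const res = await fetchFn(url);
      if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
      const bytes = await res.arrayBuffer();
      // Assumption: sizeBytes is the exact expected file size.
      if (bytes.byteLength !== entry.sizeBytes) {
        throw new Error(`size mismatch for ${url}`);
      }
      await cache.put(entry.path, bytes);
      return bytes;
    } catch (err) {
      lastError = err; // remember the failure and try the next URL
    }
  }
  throw lastError;
}
```

In a browser worker this might be wired up as `loadModel(MODEL_REGISTRY.demucs_v4, idbCache(await openModelDb()))`. Passing the cache and the fetch function in as parameters keeps the download logic testable without a browser.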

	## License

	MIT. Model weights inherit the licenses of their respective upstream projects:
	- Demucs: MIT
	- BSRNN: MIT