	---
	license: mit
	tags:
	- audio
	- audio-separation
	- stem-separation
	- demucs
	- htdemucs
	- safetensors
	- maestraea
	pipeline_tag: audio-to-audio
	---

	# HTDemucs Models (Safetensors)

	4- and 6-Stem Source Separation: Vocals, Drums, Bass, Other (plus Guitar and Piano in the 6-stem model)

	[Original Source](https://github.com/facebookresearch/demucs) by [Facebook Research](https://github.com/facebookresearch) · MIT License

	> Converted from the original `.th` checkpoint format to safetensors for faster loading and safer deserialization. For use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea).

	## Available Models

	| File | Stems | Size | Description |
	|------|-------|------|-------------|
	| `htdemucs.safetensors` | 4 (drums, bass, other, vocals) | 84 MB | Base model |
	| `htdemucs_ft.safetensors` | 4 (drums, bass, other, vocals) | 84 MB | Fine-tuned, best quality ⭐ |
	| `htdemucs_6s.safetensors` | 6 (drums, bass, other, vocals, guitar, piano) | 55 MB | 6-stem variant |

	Each model has a matching `*_config.json` with architecture parameters (sources, sample rate, channels).
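
As a rough illustration of how a loader might pair each weights file with its config, the sketch below derives the config filename from the `*_config.json` pattern and reads three fields. The key names `sources`, `samplerate`, and `audio_channels` are assumptions borrowed from upstream Demucs attribute names, and `StemModelConfig`, `config_path_for`, and `load_config` are hypothetical helpers; check the published JSON files before relying on any of them.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class StemModelConfig:
    sources: List[str]      # stem names, in model output order
    samplerate: int         # expected input sample rate in Hz
    audio_channels: int     # 2 for stereo


def config_path_for(weights_path: Path) -> Path:
    """Map e.g. htdemucs_ft.safetensors -> htdemucs_ft_config.json (assumed naming)."""
    return weights_path.with_name(f"{weights_path.stem}_config.json")


def load_config(weights_path: Path) -> StemModelConfig:
    raw = json.loads(config_path_for(weights_path).read_text(encoding="utf-8"))
    # Key names below are assumptions, not confirmed against the published files.
    return StemModelConfig(
        sources=list(raw["sources"]),
        samplerate=int(raw["samplerate"]),
        audio_channels=int(raw["audio_channels"]),
    )
```

Reading the stem list from the config rather than hard-coding it lets the same loading path serve both the 4-stem and 6-stem files.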

	## What HTDemucs Does

	HTDemucs (Hybrid Transformer Demucs) separates mixed audio into individual stems:

	- Vocals — Singing, spoken word
	- Drums — Percussion, kick, snare, hi-hat
	- Bass — Bass guitar, synth bass
	- Other — Everything else (keys, synths, FX)
	- Guitar — (6-stem model only)
	- Piano — (6-stem model only)

	### Key Features

	- Real-time capable on GPU
	- Adjustable segment size for VRAM control
	- Best-in-class separation quality (htdemucs_ft)
	- Needs roughly 4–6 GB of VRAM
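
Segment size bounds memory because the model only processes one chunk of audio at a time, and overlapping chunks are blended afterwards. A minimal sketch of the chunking arithmetic follows; the function name, the 25% default overlap, and the example numbers are illustrative assumptions, not the actual Mæstræa or Demucs implementation.

```python
def segment_bounds(total_samples, samplerate, segment_seconds, overlap=0.25):
    """Return (start, end) sample indices of overlapping chunks covering the track."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    segment = int(segment_seconds * samplerate)
    if segment <= 0:
        raise ValueError("segment must be at least one sample long")
    # Consecutive chunks start one stride apart, so they share `overlap` of a segment.
    stride = max(1, int(segment * (1 - overlap)))
    bounds = []
    for start in range(0, total_samples, stride):
        bounds.append((start, min(start + segment, total_samples)))
    return bounds


# A 30 s track at 44.1 kHz, cut into 10 s chunks.
chunks = segment_bounds(30 * 44100, 44100, segment_seconds=10)
peak = max(end - start for start, end in chunks)
print(len(chunks), "chunks, at most", peak, "samples each")
```

Halving `segment_seconds` roughly halves the largest chunk the model has to hold at once, at the cost of more chunks and more seams to blend.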

	## Original Checkpoint URLs

	These safetensors were converted from:

	| Model | Original URL |
	|-------|-------------|
	| htdemucs | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/955717e8-8726e21a.th` |
	| htdemucs_ft | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/04573f0d-f3cf25b2.th` |
	| htdemucs_6s | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/5c90dfd2-34c22ccb.th` |
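
The upstream filenames appear to follow a `<signature>-<checksum>.th` pattern, where the suffix is the first eight hex digits of the file's SHA-256 digest. This is an assumption based on how the upstream Demucs loader seems to validate downloads. Under that assumption, a downloaded original could be checked before conversion with a sketch like this; `verify_checkpoint` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path


def verify_checkpoint(path: Path) -> bool:
    """Compare the filename's checksum suffix with a prefix of the file's SHA-256."""
    # Assumed naming: "<signature>-<checksum>.th", e.g. "955717e8-8726e21a.th".
    expected = path.stem.rsplit("-", 1)[-1].lower()
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in 1 MiB blocks so large checkpoints are not loaded into memory at once.
        for block in iter(lambda: fh.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()[: len(expected)] == expected
```

A `False` result would suggest a truncated or corrupted download, or that the naming assumption does not hold for that file.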

	## Usage with Mæstræa

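	The Mæstræa AI Workstation backend downloads these models automatically. The same pretrained models can also be loaded by name with the `demucs` library, which fetches the upstream checkpoints rather than the safetensors files in this repository: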

	```python
	from demucs.pretrained import get_model

	# Look up the pretrained model by name; weights are downloaded and cached on first use.
	model = get_model("htdemucs_ft")
	```

	## License

	MIT — same as the original Demucs release.

	## Credits

	- Model: [Facebook Research / Meta AI](https://github.com/facebookresearch/demucs)
	- Paper: [Hybrid Transformers for Music Source Separation](https://arxiv.org/abs/2211.08553) (Rouard et al., 2023)
	- Conversion & Mirror by: [AEmotionStudio](https://huggingface.co/AEmotionStudio)