---
license: mit
tags:
- audio
- audio-separation
- stem-separation
- demucs
- htdemucs
- safetensors
- maestraea
pipeline_tag: audio-to-audio
---
# HTDemucs Models (Safetensors)
**4/6-Stem Source Separation — Vocals, Drums, Bass, Other (+Guitar, Piano)**
[Original Source](https://github.com/facebookresearch/demucs) by [Facebook Research](https://github.com/facebookresearch) · MIT License
> Converted from the original `.th` checkpoint format to safetensors for faster loading and safer deserialization. For use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea).
## Available Models
| File | Stems | Size | Description |
|------|-------|------|-------------|
| `htdemucs.safetensors` | 4 (drums, bass, other, vocals) | 84 MB | Base model |
| `htdemucs_ft.safetensors` | 4 (drums, bass, other, vocals) | 84 MB | **Fine-tuned** — best quality ⭐ |
| `htdemucs_6s.safetensors` | 6 (drums, bass, other, vocals, guitar, piano) | 55 MB | 6-stem variant |
Each model has a matching `*_config.json` with architecture parameters (sources, sample rate, channels).
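As a minimal sketch of how a companion config might be read: the key names `sources`, `samplerate`, and `audio_channels` below are assumptions based on the parameters listed above, not a confirmed schema, and `read_stem_layout` is a hypothetical helper. Check the actual `*_config.json` before relying on them.

```python
import json
from pathlib import Path


def read_stem_layout(config_path):
    """Return (stem name -> output index, sample rate, channel count).

    The key names used here are assumed, not confirmed; adjust them to
    match the real config file.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    # The position of a name in "sources" is taken to be its index
    # along the stem axis of the model output.
    stem_index = {name: i for i, name in enumerate(config["sources"])}
    return stem_index, config["samplerate"], config["audio_channels"]
```

With a layout like this, a caller can look up `stem_index["vocals"]` instead of hard-coding the stem position, which differs between the 4-stem and 6-stem models.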
## What HTDemucs Does
HTDemucs (Hybrid Transformer Demucs) separates mixed audio into individual stems:
- **Vocals** — Singing, spoken word
- **Drums** — Percussion, kick, snare, hi-hat
- **Bass** — Bass guitar, synth bass
- **Other** — Everything else (keys, synths, FX)
- **Guitar** — (6-stem model only)
- **Piano** — (6-stem model only)
### Key Features
- Can run in real time on a GPU
- Adjustable segment size to trade speed for lower VRAM use
- `htdemucs_ft` gives the best separation quality of the three models
- Roughly 4–6 GB of VRAM during inference, depending on segment size
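To illustrate why segment size bounds memory use, the sketch below splits a track into overlapping chunks so that only one chunk needs to be on the GPU at a time. It shows the general split-with-overlap idea, not the exact scheme used by Demucs or Mæstræa; `segment_bounds` and the default overlap of 0.25 are assumptions made for illustration.

```python
def segment_bounds(total_samples, samplerate, segment_seconds, overlap=0.25):
    """Return (start, end) sample ranges covering the whole track.

    Consecutive chunks overlap by `overlap` of a segment so that their
    edges can be cross-faded when the outputs are stitched back together.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    segment_len = int(segment_seconds * samplerate)
    stride = int((1 - overlap) * segment_len)
    if stride <= 0:
        raise ValueError("segment is too short for this overlap")
    return [
        (start, min(start + segment_len, total_samples))
        for start in range(0, total_samples, stride)
    ]
```

Shorter segments lower peak memory because each forward pass sees fewer samples, at the cost of more chunks to process and more seams to blend.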
## Original Checkpoint URLs
These safetensors were converted from:
| Model | Original URL |
|-------|-------------|
| htdemucs | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/955717e8-8726e21a.th` |
| htdemucs_ft | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/04573f0d-f3cf25b2.th` |
| htdemucs_6s | `https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/5c90dfd2-34c22ccb.th` |
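If you download one of the original checkpoints to compare it with the converted file, a checksum check can confirm the download is intact. This sketch assumes the second part of each filename (for example `8726e21a` in `955717e8-8726e21a.th`) is a prefix of the file's SHA-256 digest. That appears to be the upstream naming convention, but it is not stated in this README, and `verify_checkpoint` is a hypothetical helper rather than part of any library.

```python
import hashlib
from pathlib import Path


def verify_checkpoint(path):
    """Check a checkpoint against the hash prefix embedded in its name.

    Assumes names of the form "<signature>-<sha256 prefix>.th".
    """
    path = Path(path)
    # "955717e8-8726e21a.th" -> expected prefix "8726e21a"
    expected = path.stem.split("-")[-1]
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Hash in 1 MiB blocks so large files are not read into memory at once.
        for block in iter(lambda: f.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest().startswith(expected)
```

A `False` result means either the file is corrupted or the filename does not follow the assumed convention, so treat it as a prompt to re-download rather than as proof of tampering.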
## Usage with Mæstræa
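These models are automatically downloaded by the Mæstræa AI Workstation backend. The same models can also be loaded by name with the `demucs` library, which fetches the original `.th` checkpoints rather than the safetensors files in this repository:
```python
from demucs.pretrained import get_model

# Downloads (and caches) the upstream checkpoint for the given model name.
model = get_model("htdemucs_ft")
print(model.sources)  # stem names, in output order
```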
## License
MIT — same as the original Demucs release.
## Credits
- **Model**: [Facebook Research / Meta AI](https://github.com/facebookresearch/demucs)
- **Paper**: [Hybrid Transformers for Music Source Separation](https://arxiv.org/abs/2211.08553) (Rouard et al., 2023)
- **Conversion & Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)