---
license: mit
tags:
- audio
- music-source-separation
- sound-separation
- demucs
- htdemucs
- stem-separation
- inference
pipeline_tag: audio-to-audio
---

## Music Source Separation

These are the Demucs v4 models from Facebook Research.
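A quick usage sketch (not part of the original card): the models ship with the `demucs` Python package, whose CLI entry point can also be driven from Python with an argv list. The package name, flags, and entry point below follow the upstream repository; treat them as assumptions if your installed version differs.

```python
# Invocation sketch; assumes `pip install demucs` and the upstream CLI flags
# (-n selects a model checkpoint, --two-stems keeps one stem plus the rest).
argv = ["-n", "htdemucs", "--two-stems", "vocals", "song.mp3"]
command = "demucs " + " ".join(argv)
print(command)  # the equivalent shell command

# To actually run the separation (downloads model weights on first use):
#   import demucs.separate
#   demucs.separate.main(argv)
```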
---

## What is HTDemucs?

[HTDemucs (Hybrid Transformer Demucs)](https://github.com/facebookresearch/demucs) is Meta AI's fourth-generation music source separation model, introduced in [*Hybrid Transformers for Music Source Separation* (Rouard et al., ICASSP 2023)](https://arxiv.org/abs/2211.08553).

Where earlier Demucs generations processed audio purely in the time domain, HTDemucs runs **two parallel encoders simultaneously** — one operating on the raw waveform, the other on the STFT spectrogram — with a **Transformer Encoder with cross-attention** at the bottleneck connecting them. This lets the model correlate time-domain and frequency-domain features before decoding, yielding measurably better separation quality — especially on spectrally complex, temporally sparse instruments like piano and guitar.
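As a toy illustration of the two input views (my own sketch, not code from the repository): the time branch consumes raw samples, while the spectral branch consumes STFT-like magnitudes, here computed with a naive standard-library DFT over a single frame.

```python
import cmath
import math

def dft_mag(frame):
    """Magnitude spectrum of one frame via a naive DFT (stdlib only)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

# Toy "mixture": a pure tone with 3 cycles over a 16-sample frame.
signal = [math.cos(2 * math.pi * 3 * t / 16) for t in range(16)]

time_branch = signal           # what the waveform branch sees (raw samples)
freq_branch = dft_mag(signal)  # what the spectrogram branch sees (magnitudes)

peak_bin = max(range(len(freq_branch)), key=freq_branch.__getitem__)
print(peak_bin)  # the tone appears as a single spectral peak at bin 3
```

The point of the hybrid design is that a feature like this tone is trivially localized in the frequency view but spread across every sample in the time view; the cross-attention bottleneck lets each branch borrow the other's strengths.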
The `htdemucs_6s` variant adds dedicated guitar and piano stems on top of the standard drums/bass/other/vocals quad, making it one of the most capable publicly available separation models for music production use.
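For orientation, here is a sketch of where the six stems land, based on the default output layout described in the upstream README (`separated/<model>/<track>/<stem>.wav`); the helper name is hypothetical and the directory convention is an assumption if your version configures it differently.

```python
from pathlib import Path

# The six stems produced by htdemucs_6s.
STEMS_6S = ["drums", "bass", "other", "vocals", "guitar", "piano"]

def expected_outputs(track, model="htdemucs_6s", root="separated"):
    """Hypothetical helper: paths the demucs CLI writes by default,
    <root>/<model>/<track name>/<stem>.wav."""
    out_dir = Path(root) / model / Path(track).stem
    return [out_dir / f"{name}.wav" for name in STEMS_6S]

for path in expected_outputs("song.mp3"):
    print(path)
```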
---

From Facebook Research:

Demucs is based on a U-Net convolutional architecture inspired by Wave-U-Net and SING, with GLUs, a BiLSTM between the encoder and decoder, specific initialization of weights, and transposed convolutions in the decoder.
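To make one of the quoted ingredients concrete, here is a minimal GLU (gated linear unit) on a plain Python list. This is an illustrative sketch of the activation itself, not code from Demucs:

```python
import math

def glu(x):
    """Gated Linear Unit: split the input in half; the second half gates
    the first through a sigmoid, letting the network learn to suppress
    features element by element."""
    h = len(x) // 2
    a, b = x[:h], x[h:]
    return [ai * (1.0 / (1.0 + math.exp(-bi))) for ai, bi in zip(a, b)]

out = glu([1.0, 2.0, 0.0, 100.0])
print(out)  # gate 0.0 passes half the signal; gate 100.0 passes almost all
```

Note the output has half the input's length: in Demucs this halving is applied across channels after each convolution, so the convolutions produce twice the channels the next layer consumes.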
See [facebookresearch's repository](https://github.com/facebookresearch/demucs) for more information on Demucs.