iBoostAI committed on
Commit bffc1c1 · verified · 1 Parent(s): 47b03c8

Update README.md

Files changed (1): README.md +33 -33
README.md CHANGED
---
license: mit
tags:
- audio
- music-source-separation
- sound-separation
- demucs
- htdemucs
- stem-separation
- inference
pipeline_tag: audio-to-audio
---

## Music Source Separation

These are the Demucs v4 models from Facebook Research, serialized from their pretrained checkpoints.

---

## What is HTDemucs?

[HTDemucs (Hybrid Transformer Demucs)](https://github.com/facebookresearch/demucs) is Meta AI's fourth-generation music source separation model, introduced in [*Hybrid Transformers for Music Source Separation* (Rouard et al., ICASSP 2023)](https://arxiv.org/abs/2211.08553).

Where earlier Demucs generations processed audio purely in the time domain, HTDemucs runs **two parallel encoders**, one operating on the raw waveform and the other on the STFT spectrogram, joined at the bottleneck by a **cross-attention Transformer encoder**. This lets the model correlate time-domain and frequency-domain features before decoding, yielding measurably better separation quality, especially on spectrally complex, temporally sparse instruments such as piano and guitar.
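The two "views" the parallel encoders consume can be illustrated with a minimal NumPy sketch (an illustration only, not the actual model code — the window size, hop, and test tone are arbitrary choices): the same signal is kept as a raw waveform and also turned into an STFT magnitude spectrogram.

```python
import numpy as np

def stft(signal, win=512, hop=256):
    """Naive STFT: Hann-windowed frames, one FFT per frame (illustration only)."""
    window = np.hanning(win)
    frames = [signal[i:i + win] * window
              for i in range(0, len(signal) - win + 1, hop)]
    return np.array([np.fft.rfft(f) for f in frames])

sr = 8000
t = np.arange(sr) / sr
wave = np.sin(2 * np.pi * 440 * t)   # time-domain view: 1-D array of samples
spec = np.abs(stft(wave))            # frequency-domain view: (frames, freq bins)
print(wave.shape, spec.shape)
```

A time-domain encoder sees `wave` directly; a spectral encoder sees `spec`, where the 440 Hz test tone shows up as energy concentrated in one frequency bin.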

The `htdemucs_6s` variant adds dedicated guitar and piano stems on top of the standard drums/bass/other/vocals quad, making it the most capable publicly available separation model for music production use.
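A quick way to try these models is the `demucs` command-line tool from the repository above; a minimal sketch (the exact output directory layout may vary by version, and the model weights are downloaded on first use):

```shell
# Install the demucs package from PyPI (assumes Python 3 with pip available)
pip install demucs

# Separate a track into six stems with the htdemucs_6s model;
# stems are written under ./separated/htdemucs_6s/<track name>/
demucs -n htdemucs_6s my_track.mp3
```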

---

From Facebook Research:

> Demucs is based on a U-Net convolutional architecture inspired by Wave-U-Net and SING, with GLUs, a BiLSTM between the encoder and decoder, specific initialization of weights, and transposed convolutions in the decoder.
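The GLU (gated linear unit) mentioned above splits its input in half along a channel axis and gates one half with the sigmoid of the other; a minimal NumPy sketch (not Demucs code, just the operation itself):

```python
import numpy as np

def glu(x, axis=-1):
    """Gated linear unit: split x in two along `axis`, return a * sigmoid(b)."""
    a, b = np.split(x, 2, axis=axis)
    return a / (1.0 + np.exp(-b))  # a * sigmoid(b), written without an extra temp

x = np.array([[1.0, 2.0, 0.0, 0.0]])  # gate half is all zeros: sigmoid(0) = 0.5
print(glu(x))                          # so the output is half of [1.0, 2.0]
```

Note that a GLU halves the channel dimension, which the surrounding convolutions must account for.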

See [facebookresearch's repository](https://github.com/facebookresearch/demucs) for more information on Demucs.