JaCappella Corpus: A Japanese a Cappella Vocal Ensemble Corpus
Paper: arXiv:2211.16028
This model was trained by Tomohiko Nakamura using the codebase.
It was trained on the vocal ensemble separation task of the jaCappella dataset.
The paper was published at ICASSP 2023 (arXiv:2211.16028).
See the jaCappella dataset page.
data:
  num_workers: 12
  sample_rate: 48000
  samples_per_track: 13
  seed: 42
  seq_dur: 6.0
  source_augmentations:
    - gain
  sources:
    - vocal_percussion
    - bass
    - alto
    - tenor
    - soprano
    - lead_vocal
model:
  bandwidth: 16000
  bidirectional: true
  hidden_size: 512
  in_chan: 4096
  nb_channels: 1
  nhop: 1024
  pretrained: null
  spec_power: 1
  window_length: 4096
optim:
  lr: 0.001
  lr_decay_gamma: 0.3
  lr_decay_patience: 80
  optimizer: adam
  patience: 1000
  weight_decay: 1.0e-05
training:
  batch_size: 16
  epochs: 1000
  loss_combine_sources: true
  loss_use_multidomain: true
  mix_coef: 10.0
  val_dur: 80.0
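To make the time-frequency settings in the configuration easier to read, the following minimal sketch derives a few quantities from them. The function name `stft_summary` is hypothetical, and two conventions are assumptions rather than facts from the source: a centered STFT (giving `1 + n_samples // nhop` frames) and a bandwidth cutoff that keeps every bin whose center frequency is at or below `bandwidth`.

```python
# Values copied from the configuration above.
SAMPLE_RATE = 48000    # data.sample_rate, in Hz
SEQ_DUR = 6.0          # data.seq_dur, in seconds
WINDOW_LENGTH = 4096   # model.window_length, in samples
NHOP = 1024            # model.nhop, in samples
BANDWIDTH = 16000      # model.bandwidth, in Hz


def stft_summary(sample_rate, seq_dur, window_length, nhop, bandwidth):
    """Derive STFT-related sizes from the config (hypothetical helper)."""
    n_samples = int(seq_dur * sample_rate)
    n_bins = window_length // 2 + 1        # one-sided spectrum
    bin_hz = sample_rate / window_length   # spacing between bin centers
    # Assumption: centered STFT, so the frame count is 1 + n_samples // nhop.
    n_frames = 1 + n_samples // nhop
    # Assumption: keep all bins whose center frequency is <= bandwidth.
    max_bin = min(int(bandwidth // bin_hz) + 1, n_bins)
    return {
        "n_samples": n_samples,
        "n_bins": n_bins,
        "bin_hz": bin_hz,
        "n_frames": n_frames,
        "max_bin": max_bin,
        "overlap": 1 - nhop / window_length,
    }


summary = stft_summary(SAMPLE_RATE, SEQ_DUR, WINDOW_LENGTH, NHOP, BANDWIDTH)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under these assumptions, each 6-second training segment holds 288000 samples, the window overlap is 75%, and roughly the lowest 1366 of 2049 frequency bins fall within the 16 kHz bandwidth.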
| Method | Lead vocal | Soprano | Alto | Tenor | Bass | Vocal percussion |
|---|---|---|---|---|---|---|
| X-UMX | 7.5 | 10.7 | 13.5 | 10.2 | 9.1 | 21.0 |
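As a quick way to compare parts, the sketch below summarizes the X-UMX row of the table. The unweighted mean is a convenience figure computed here, not a value reported in the source, and the variable names are illustrative.

```python
# Scores for the X-UMX row, copied from the table above.
scores = {
    "Lead vocal": 7.5,
    "Soprano": 10.7,
    "Alto": 13.5,
    "Tenor": 10.2,
    "Bass": 9.1,
    "Vocal percussion": 21.0,
}

# Unweighted mean over the six parts (derived here, not from the source).
mean_score = sum(scores.values()) / len(scores)

# Parts with the highest and lowest score.
best_part = max(scores, key=scores.get)
worst_part = min(scores, key=scores.get)

print(f"mean over parts: {mean_score:.1f}")
print(f"highest: {best_part} ({scores[best_part]})")
print(f"lowest: {worst_part} ({scores[worst_part]})")
```

In this row, vocal percussion has the highest score and lead vocal the lowest.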