---
license: apache-2.0
language:
- en
tags:
- hearing loss
- challenge
- signal processing
- source separation
- audio
- audio-to-audio
- Causal
---

# Cadenza Challenge: CAD2-Task2

A causal source separation model for the CAD2-Task2 system.

This model is an ensemble of eight separation models, one per instrument:
- Bassoon
- Cello
- Clarinet
- Flute
- Oboe
- Sax
- Viola
- Violin

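The ensemble idea can be sketched without any dependencies: one separator per instrument, all applied to the same mixture, with the results collected in a dictionary keyed by instrument. The names `make_placeholder_separator` and `separate_all` are hypothetical, and the placeholder separators only scale the input; they stand in for the real networks and are not part of the released code.

```python
INSTRUMENTS = [
    "bassoon", "cello", "clarinet", "flute", "oboe", "sax", "viola", "violin",
]


def make_placeholder_separator(gain):
    """Return a stand-in 'model' that only scales the mixture (hypothetical)."""
    def separate(mixture):
        return [gain * sample for sample in mixture]
    return separate


# One separator per instrument; the real ensemble would hold trained networks
ensemble = {
    name: make_placeholder_separator(1 / len(INSTRUMENTS)) for name in INSTRUMENTS
}


def separate_all(mixture):
    """Run every per-instrument separator on the same mixture."""
    return {name: separator(mixture) for name, separator in ensemble.items()}


# Toy four-sample mono mixture, for illustration only
mixture = [0.8, -0.4, 0.0, 0.16]
separated_sources = separate_all(mixture)
print(sorted(separated_sources))
```

The resulting dictionary has the same shape as the `separated_sources` argument of the training loss in the Description section.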
Each model is based on ConvTasNet (Kaituo Xu) with multichannel support (Alexandre Defossez).
* Parameters:
  * B: 256
  * C: 2
  * H: 512
  * L: 20
  * N: 256
  * P: 3
  * R: 3
  * X: 8
  * audio_channels: 2
  * causal: true
  * mask_nonlinear: relu
  * norm_type: cLN

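As a rough illustration of what these hyperparameters imply, the temporal context seen by each output frame can be estimated from L, P, X and R. The sketch below assumes the usual Conv-TasNet layout, which this card does not spell out: an encoder hop of L/2 samples and dilations that double across the X blocks of each repeat.

```python
# Hyperparameters copied from the parameter list
L = 20  # encoder filter length, in samples
P = 3   # kernel size of each convolutional block
X = 8   # convolutional blocks per repeat
R = 3   # number of repeats

# Assumption: encoder hop of L // 2 samples
stride = L // 2

# Assumption: block x of each repeat uses dilation 2 ** x
dilations = [2 ** x for x in range(X)]

# Each block widens the receptive field by (P - 1) * dilation frames
receptive_field_frames = 1 + R * (P - 1) * sum(dilations)

# Convert encoder frames to input samples
receptive_field_samples = (receptive_field_frames - 1) * stride + L

print(receptive_field_frames, receptive_field_samples)  # 1531 15320
```

Under these assumptions each output frame depends on 1531 encoder frames, or 15320 input samples per channel. With `causal: true`, that context would cover the current and past samples only.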
## Dataset
The model was trained on the EnsembleSet and CadenzaWoodwind datasets.

## How to use

```
from dynamic_source_separator import DynamicSourceSeparator

model = DynamicSourceSeparator.from_pretrained(
    "cadenzachallenge/Dynamic_Source_Separator_Causal"
).cpu()
```

## Description

Audio source separation model used in System T002 for the [Cadenza2 Task2 Challenge](https://cadenzachallenge.org/docs/cadenza2/Rebalancing%20Classical/rebalancing).

The model is a fine-tuned version of the eight ConvTasNet models from the Task2 baseline.
Training optimised both the estimated sources and the reconstructed mixture:

$$
\text{Loss} = \sum_{s \in \text{Sources}} L_1(\text{estimated source}_s, \text{reference source}_s) + L_1(\text{reconstructed mixture}, \text{original mixture})
$$

```python
import torch.nn as nn


def dynamic_masked_loss(mixture, separated_sources, ground_truth_sources, indicator):
    # Reconstruction loss: the estimated sources should sum to the mixture
    reconstruction = sum(separated_sources.values())
    reconstruction_loss = nn.L1Loss()(reconstruction, mixture)

    # Separation loss: only instruments flagged as active contribute
    separation_loss = 0
    for instrument, active in indicator.items():
        if active:
            separation_loss += nn.L1Loss()(
                separated_sources[instrument], ground_truth_sources[instrument]
            )

    return reconstruction_loss + separation_loss
```

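A small hand-checkable example may help show how the two terms interact. The sketch below is a dependency-free stand-in for the PyTorch loss, with L1 taken as the mean absolute error. The four-sample signals are made up for illustration, and it assumes the mixture is the sample-wise sum of the reference sources.

```python
def l1(a, b):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical four-sample mono signals, for illustration only
references = {"flute": [0.5, -0.5, 0.25, 0.0], "cello": [0.0, 0.0, 0.0, 0.0]}
estimates = {"flute": [0.5, -0.5, 0.25, 0.0], "cello": [0.25, 0.0, 0.0, 0.0]}
indicator = {"flute": True, "cello": False}  # the cello is not playing

# Assumption: the mixture is the sample-wise sum of the reference sources
mixture = [sum(samples) for samples in zip(*references.values())]
reconstruction = [sum(samples) for samples in zip(*estimates.values())]

reconstruction_loss = l1(reconstruction, mixture)

# Only active instruments contribute to the separation term
separation_loss = sum(
    l1(estimates[name], references[name])
    for name, active in indicator.items()
    if active
)

loss = reconstruction_loss + separation_loss
print(reconstruction_loss, separation_loss, loss)
```

Here the flute estimate is exact, so the separation term is zero. The spurious cello output is masked out of the separation term but still raises the reconstruction term to 0.0625, so energy leaking into an inactive instrument is penalised through the mixture.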
The model and the T002 recipe are shared in the [Clarity toolkit](https://github.com/claritychallenge/clarity).