# AudioSep: Separate Anything You Describe

## Input

* **Mixed audio file**

Audio file in wav format with mixed sources. [input.wav](./input.wav)

https://github.com/axinc-ai/ailia-models/assets/53651931/4b761212-a1c7-46dc-b598-a08e4c5ab7ff

This audio file was adapted from the [official AudioSep implementation](https://github.com/Audio-AGI/AudioSep):

https://audio-agi.github.io/Separate-Anything-You-Describe/demos/exp31_water/drops_mixture.wav

* **Text condition**

Text description of the sound source you want to separate.

## Output

* **Audio file**

Separated audio source according to the text query.

Saved to ```./output.wav``` by default; the path can be changed with the ```-s``` option.
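
To sanity-check the separated file, its basic properties can be read with Python's standard ```wave``` module. This is a minimal sketch, not part of the script itself: ```describe_wav``` is a hypothetical helper name, and it assumes the output is an uncompressed PCM WAV file (the ```wave``` module cannot read other encodings).

```python
import wave


def describe_wav(path):
    """Return channel count, sample rate and duration of a PCM WAV file.

    Hypothetical helper for illustration; assumes PCM encoding.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration_sec": frames / rate,
        }


# Example use, once the script has produced a file:
#   print(describe_wav("output.wav"))
```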

## Usage
An internet connection is required the first time the script runs, because the model files are downloaded automatically.

Running the script separates the sound source described by the text query from the input audio file.

#### Example 1: Extract the sound of thunder
```bash
$ python3 audiosep.py -p "thunder" -i input.wav -s output_thunder.wav
```
https://github.com/axinc-ai/ailia-models/assets/53651931/d0d016dd-a808-4eb6-a4b5-9791f8f1bd2f

#### Example 2: Extract the sound of water drops
```bash
$ python3 audiosep.py -p "water drops" -i input.wav -s output_waterdrops.wav
```
https://github.com/axinc-ai/ailia-models/assets/53651931/7710b6c9-49dc-4d2a-8489-ccbf7fb45591

In both cases, a ```.wav``` file containing the sound source separated from the original mixture is created.
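
When several sources need to be extracted from the same mixture, the commands above can be generated from a list of queries. The following is a minimal sketch that only builds and prints the command lines, using the ```-p```, ```-i``` and ```-s``` options shown in the examples; ```build_command``` and ```output_name``` are hypothetical helper names, not part of ```audiosep.py```.

```python
import shlex


def build_command(prompt, input_path="input.wav", output_path="output.wav"):
    """Build the argument list for one separation run (hypothetical helper)."""
    return ["python3", "audiosep.py", "-p", prompt, "-i", input_path, "-s", output_path]


def output_name(prompt):
    """Derive an output file name from the query, e.g. output_waterdrops.wav."""
    return "output_" + prompt.replace(" ", "") + ".wav"


queries = ["thunder", "water drops"]
commands = [build_command(q, output_path=output_name(q)) for q in queries]

for cmd in commands:
    # shlex.join quotes arguments that contain spaces, such as "water drops".
    print(shlex.join(cmd))
```

To actually run each command rather than print it, each list can be passed to ```subprocess.run(cmd, check=True)``` from the directory containing ```audiosep.py```.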

## Reference

* [AudioSep](https://github.com/Audio-AGI/AudioSep)
* [Separate Anything You Describe](https://audio-agi.github.io/Separate-Anything-You-Describe/)

## Framework

PyTorch

## Model Format

ONNX opset=11

## Netron

* [audiosep_text.onnx.prototxt](https://netron.app/?url=https://storage.googleapis.com/ailia-models/audiosep/audiosep_text.onnx.prototxt)
* [audiosep_resunet.onnx.prototxt](https://netron.app/?url=https://storage.googleapis.com/ailia-models/audiosep/audiosep_resunet.onnx.prototxt)