---
license: cc-by-nc-4.0
tags:
- audio
- music-source-separation
- source-separation
pipeline_tag: audio-to-audio
---
# Piano Source Separation Model

This repository contains a 17 MB piano separation model and an inference script for running it.

The model takes an audio track as input and outputs the isolated piano.

## Examples

Listen to some examples here: https://tjpurdy.github.io/Piano-Separation-Model-small/

## Input and output

- Supported input formats: `wav`, `flac`, `mp3`
- Supported output formats: `wav`, `flac` (selected with `--output_format wav` or `--output_format flac`)
- `--input_dir` can point to either a single file or a directory containing multiple files
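
The file-or-directory behaviour of `--input_dir` can be pictured with a short sketch. This is an illustration only, not the code in `inference.py`: the helper name `collect_inputs` is hypothetical, and both the case-insensitive extension matching and the choice not to descend into subdirectories are assumptions.

```python
from pathlib import Path

# Extensions listed as supported inputs in this README.
SUPPORTED_INPUTS = {".wav", ".flac", ".mp3"}


def collect_inputs(input_path):
    """Return the supported audio files behind a file or directory path.

    Hypothetical helper: inference.py may resolve its inputs differently.
    """
    path = Path(input_path)
    if path.is_file():
        candidates = [path]
    elif path.is_dir():
        # Assumption: only the top level of the directory is scanned.
        candidates = sorted(p for p in path.iterdir() if p.is_file())
    else:
        raise FileNotFoundError(f"No such file or directory: {path}")
    # Assumption: extensions are matched case-insensitively.
    return [p for p in candidates if p.suffix.lower() in SUPPORTED_INPUTS]
```

Files with any other extension are skipped rather than treated as errors in this sketch.
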
## Installation
```bash
pip install torch einops rotary-embedding-torch numpy soundfile safetensors
```
## Usage
Download `inference.py`, then run the command below with `--input_dir` set to your input path. The model and config are downloaded automatically.

```bash
python inference.py --input_dir 'Insert path to file or directory containing file(s) here'
```
## Extra options

- `--output_dir`: where the outputs are saved. The default is the same location as `--input_dir`. Output filenames end in `_piano`.
- `--checkpoint_path`: where the model checkpoint is located. If it is not found, the script downloads it automatically.
- `--config_path`: where `config.json` is located. If it is not found, the script downloads it automatically.
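
As a rough illustration of the `_piano` naming rule, the sketch below builds an output path from an input file. The helper `piano_output_path` is hypothetical and not taken from `inference.py`; the `wav` default and the fallback to the input file's own folder are assumptions made for the example.

```python
from pathlib import Path


def piano_output_path(input_file, output_dir=None, output_format="wav"):
    """Build an output path such as song.mp3 -> song_piano.wav.

    Hypothetical helper: the exact naming logic in inference.py is assumed.
    """
    if output_format not in ("wav", "flac"):
        raise ValueError("output_format must be 'wav' or 'flac'")
    src = Path(input_file)
    # Assumption: with no output_dir, write next to the input file.
    target_dir = Path(output_dir) if output_dir is not None else src.parent
    return target_dir / f"{src.stem}_piano.{output_format}"
```

Under these assumptions, `piano_output_path("music/song.mp3", "out", "flac")` gives `out/song_piano.flac`.
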
## Notes
- This model is trained on the conventional piano only; it will not work on variants such as the electric piano.
- The GPU is used automatically if one is available (3 GB of VRAM required); otherwise the CPU is used.
- The model was trained on 44.1 kHz audio.
- Processing takes about 1 second per minute of audio on a Google Colab T4 GPU.
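
Since the model was trained on 44.1 kHz audio, it can be useful to check an input's sample rate before running it. The sketch below does this for PCM WAV files using Python's standard `wave` module. The function names are hypothetical, and whether `inference.py` resamples other rates on its own is not stated here, so treat this as an optional pre-check rather than a requirement.

```python
import wave

MODEL_SAMPLE_RATE = 44100  # 44.1 kHz, the training rate given in the notes


def wav_sample_rate(path):
    """Read the sample rate of a PCM WAV file from its header."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getframerate()


def matches_model_rate(path):
    """Return True if the WAV file is already at 44.1 kHz."""
    return wav_sample_rate(path) == MODEL_SAMPLE_RATE
```

For `flac` or `mp3` inputs a different reader would be needed; this example covers WAV only.
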
## Citation
Please cite this repository if you use this model in research or a project.
## Credit
- Wei-Tsung Lu, Ju-Chiang Wang, Qiuqiang Kong, Yun-Ning Hung - https://arxiv.org/abs/2309.02612
- lucidrains - https://github.com/lucidrains/BS-RoFormer

<p align="center">
  <img alt="train-loss" src="https://raw.githubusercontent.com/tjpurdy/Piano-Separation-Model-small/main/docs/trainloss.png">
</p>