jerryhai committed
Commit · 88207f3
1 Parent(s): 6e24bd7
Track binary files with Git LFS
- .ipynb_checkpoints/README-checkpoint.md +86 -0
- README.md +1 -1
.ipynb_checkpoints/README-checkpoint.md
ADDED
|
@@ -0,0 +1,86 @@
| 1 |
+
<img src="img\cover.jpg">
|
| 2 |
+
|
| 3 |
+
# Diff-Pitcher (PyTorch)
|
| 4 |
+
|
| 5 |
+
Official PyTorch implementation of [Diff-Pitcher: Diffusion-based Singing Voice Pitch Correction](https://engineering.jhu.edu/lcap/data/uploads/pdfs/waspaa2023_hai.pdf)
|
| 6 |
+
|
| 7 |
+
--------------------
|
| 8 |
+
|
| 9 |
+
Thank you all for your interest in this research project. I am currently optimizing the model's performance and computational efficiency. I plan to release a user-friendly version, either a GUI or a VST, in the first half of this year, and will update the open-source license.
|
| 10 |
+
|
| 11 |
+
If you are familiar with PyTorch, you can follow [Code Examples](#examples) to use Diff-Pitcher.
|
| 12 |
+
|
| 13 |
+
--------------------
|
| 14 |
+
|
| 15 |
+
Diff-Pitcher
|
| 16 |
+
|
| 17 |
+
- [Demo Page](#demo)
|
| 18 |
+
- [Todo List](#todo)
|
| 19 |
+
- [Code Examples](#examples)
|
| 20 |
+
- [References](#references)
|
| 21 |
+
- [Acknowledgement](#acknowledgement)
|
| 22 |
+
|
| 23 |
+
## Demo
|
| 24 |
+
|
| 25 |
+
🎵 Listen to [examples](https://jhu-lcap.github.io/Diff-Pitcher/)
|
| 26 |
+
|
| 27 |
+
## Todo
|
| 28 |
+
- [x] Update code and demo
|
| 29 |
+
- [x] Support 🤗 [Diffusers](https://github.com/huggingface/diffusers)
|
| 30 |
+
- [x] Upload checkpoints
|
| 31 |
+
- [x] Pipeline tutorial
|
| 32 |
+
- [ ] Merge to [Your-Stable-Audio](https://github.com/haidog-yaqub/Your-Stable-Audio)
|
| 33 |
+
- [ ] Audio Plugin Support
|
| 34 |
+
## Examples
|
| 35 |
+
- Download checkpoints: 🎒[ckpts](https://github.com/haidog-yaqub/DiffPitcher/tree/main/ckpts)
|
| 36 |
+
- Prepare environment: [requirements.txt](requirements.txt)
|
| 37 |
+
- Feel free to try:
|
| 38 |
+
- template-based automatic pitch correction: [template_based_apc.py](template_based_apc.py)
|
| 39 |
+
- score-based automatic pitch correction: [score_based_apc.py](score_based_apc.py)
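The steps above can be sketched as a small pre-flight check. This is a hedged illustration, not part of the repository: the helper names `missing_prerequisites` and `setup_commands` are hypothetical, it assumes the checkpoints are placed in a `ckpts` directory at the repository root, and it assumes the two scripts run with their built-in defaults (they may instead need arguments or edited paths).

```python
import sys
from pathlib import Path

# Names taken from the README; placing them at the repository root is an assumption.
REQUIRED = ["ckpts", "requirements.txt", "template_based_apc.py", "score_based_apc.py"]


def missing_prerequisites(repo_root):
    """Return the README-listed items that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in REQUIRED if not (root / name).exists()]


def setup_commands(script="template_based_apc.py"):
    """Build (but do not run) the install and launch commands."""
    install = [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"]
    # Assumption: the script runs with its built-in defaults and no arguments.
    run = [sys.executable, script]
    return [install, run]


if __name__ == "__main__":
    missing = missing_prerequisites(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        # Print the commands so they can be reviewed before running them.
        for cmd in setup_commands():
            print(" ".join(cmd))
```

Run from the repository root, this only reports what is missing or prints the commands; passing `"score_based_apc.py"` to `setup_commands` selects the score-based script instead.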
|
| 40 |
+
|
| 41 |
+
|
| 42 |
+
## References
|
| 43 |
+
|
| 44 |
+
If you find the code useful for your research, please consider citing:
|
| 45 |
+
|
| 46 |
+
```bibtex
|
| 47 |
+
@inproceedings{hai2023diff,
|
| 48 |
+
title={Diff-Pitcher: Diffusion-Based Singing Voice Pitch Correction},
|
| 49 |
+
author={Hai, Jiarui and Elhilali, Mounya},
|
| 50 |
+
booktitle={2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
|
| 51 |
+
pages={1--5},
|
| 52 |
+
year={2023},
|
| 53 |
+
organization={IEEE}
|
| 54 |
+
}
|
| 55 |
+
```
|
| 56 |
+
|
| 57 |
+
This repo is inspired by:
|
| 58 |
+
|
| 59 |
+
```bibtex
|
| 60 |
+
@article{popov2021diffusion,
|
| 61 |
+
title={Diffusion-based voice conversion with fast maximum likelihood sampling scheme},
|
| 62 |
+
author={Popov, Vadim and Vovk, Ivan and Gogoryan, Vladimir and Sadekova, Tasnima and Kudinov, Mikhail and Wei, Jiansheng},
|
| 63 |
+
journal={arXiv preprint arXiv:2109.13821},
|
| 64 |
+
year={2021}
|
| 65 |
+
}
|
| 66 |
+
```
|
| 67 |
+
```bibtex
|
| 68 |
+
@inproceedings{liu2022diffsinger,
|
| 69 |
+
title={Diffsinger: Singing voice synthesis via shallow diffusion mechanism},
|
| 70 |
+
author={Liu, Jinglin and Li, Chengxi and Ren, Yi and Chen, Feiyang and Zhao, Zhou},
|
| 71 |
+
booktitle={Proceedings of the AAAI conference on artificial intelligence},
|
| 72 |
+
volume={36},
|
| 73 |
+
number={10},
|
| 74 |
+
pages={11020--11028},
|
| 75 |
+
year={2022}
|
| 76 |
+
}
|
| 77 |
+
```
|
| 78 |
+
|
| 79 |
+
## Acknowledgement
|
| 80 |
+
|
| 81 |
+
[Welcome to LCAP! < LCAP (jhu.edu)](https://engineering.jhu.edu/lcap/)
|
| 82 |
+
|
| 83 |
+
We borrow code from the following repos:
|
| 84 |
+
|
| 85 |
+
- `Diffusion Schedulers` are based on 🤗 [Diffusers](https://github.com/huggingface/diffusers)
|
| 86 |
+
- `2D UNet` is based on [DiffVC](https://github.com/huawei-noah/Speech-Backbones/tree/main/DiffVC)
|
README.md
CHANGED
|
@@ -1,4 +1,4 @@
|
|
| 1 |
-
<img src="img\cover.
|
| 2 |
|
| 3 |
# Diff-Pitcher (PyTorch)
|
| 4 |
|
|
|
|
| 1 |
+
<img src="img\cover.jpg">
|
| 2 |
|
| 3 |
# Diff-Pitcher (PyTorch)
|
| 4 |
|