---
license: cc-by-nc-sa-4.0
---
# 🌸 PitchFlower
<p align="left">
<a href="https://arxiv.org/abs/2510.25566">
<img src="https://img.shields.io/badge/arXiv-PitchFlower-b31b1b?logo=arxiv&logoColor=white" alt="arXiv">
</a>
<a href="https://github.com/diegotg2000/PitchFlower">
<img src="https://img.shields.io/badge/GitHub-PitchFlower-181717?logo=github" alt="GitHub">
</a>
</p>
Official pretrained checkpoint for the paper *PitchFlower: A flow-based neural audio codec with pitch controllability*.
## 🧠 Overview
PitchFlower achieves pitch controllability by means of a perturbation strategy. During training, pitch information is removed from the input by applying a random flattening and shifting operation. The model is trained on a reconstruction task in which the true pitch is provided explicitly as conditioning, so at inference time the output pitch can be controlled through that conditioning signal.
<p align="center">
<img src="pitchflower_diagram.png" alt="PitchFlower architecture" width="600">
</p>
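To make the flattening and shifting perturbation more concrete, here is a minimal sketch that operates on a per-frame pitch contour only. It is an illustration under stated assumptions, not the code from the repository: the function name `flatten_and_shift_f0`, the use of the median as the flat value, the `max_shift_semitones` range, and the convention that `0.0` marks unvoiced frames are all hypothetical choices. See the paper and the GitHub repo for the actual procedure.

```python
import random
from statistics import median


def flatten_and_shift_f0(f0_hz, max_shift_semitones=6.0, rng=None):
    """Return a perturbed pitch contour: flat over voiced frames, then shifted.

    f0_hz: per-frame pitch values in Hz, with 0.0 marking unvoiced frames
           (an assumed convention).
    max_shift_semitones: hypothetical range of the random shift.
    rng: optional random.Random instance for reproducibility.
    """
    if rng is None:
        rng = random.Random()

    voiced = [f for f in f0_hz if f > 0.0]
    if not voiced:
        # A fully unvoiced segment has no pitch to perturb.
        return list(f0_hz)

    # Flatten: replace every voiced frame with a single constant value.
    flat_value = median(voiced)

    # Shift: move the flat contour by a random number of semitones.
    shift = rng.uniform(-max_shift_semitones, max_shift_semitones)
    shifted_value = flat_value * 2.0 ** (shift / 12.0)

    # Unvoiced frames stay unvoiced.
    return [shifted_value if f > 0.0 else 0.0 for f in f0_hz]
```

After this perturbation the contour no longer carries the original melody, so a model that must reconstruct the original audio has to rely on the pitch supplied as conditioning.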
We use an autoencoder with an RVQ bottleneck and a flow-based decoder to produce high-quality audio. More details can be found in the paper.
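For readers unfamiliar with residual vector quantization (RVQ), the sketch below shows the general technique in plain Python: each stage quantizes what the previous stages left unexplained. The helper names (`nearest_code`, `rvq_encode`, `rvq_decode`) and the toy codebooks are hypothetical and only illustrate the idea; they are not taken from the PitchFlower code, whose bottleneck configuration is described in the paper.

```python
def nearest_code(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage, each stage coding the remaining residual."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest_code(codebook, residual)
        indices.append(idx)
        # Subtract the chosen code so the next stage sees only the error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(indices, codebooks):
    """Reconstruct the vector as the sum of the selected codes."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy example with two stages of two 2-D codes each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.25, -0.25]],
]
codes = rvq_encode([1.2, 0.8], codebooks)
approx = rvq_decode(codes, codebooks)
```

Because only the discrete indices pass through the bottleneck, the amount of information the encoder can transmit is limited by the number of stages and the codebook sizes.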
## 📦 Installation and Usage
Check out our GitHub repo to learn how to install and use PitchFlower: https://github.com/diegotg2000/PitchFlower
## 🙌 Acknowledgements
We'd like to acknowledge the repositories from which we drew inspiration and parts of the code:
- Vocos: https://github.com/gemelo-ai/vocos
- WavTokenizer: https://github.com/jishengpeng/WavTokenizer
- Encodec: https://github.com/facebookresearch/encodec
This work was carried out in the [Analysis/Synthesis team of the STMS laboratory](https://www.stms-lab.fr/team/analyse-et-synthese-des-sons/) at IRCAM and was funded by the [ANR project EVA](https://anr.fr/Project-ANR-23-CE23-0018).
## 📫 Contact
For questions or collaboration opportunities, feel free to reach out: dtorres@ircam.fr
## 🧩 Citation
```bibtex
@misc{pitchflower,
title={PitchFlower: A flow-based neural audio codec with pitch controllability},
author={Diego Torres and Axel Roebel and Nicolas Obin},
year={2025},
eprint={2510.25566},
archivePrefix={arXiv},
url={https://arxiv.org/abs/2510.25566},
}
```
## 📜 License
This project is licensed under the [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license.