---
license: cc-by-nc-sa-4.0
---
# 🌸 PitchFlower
Official pretrained checkpoint for the paper *PitchFlower: A flow-based neural audio codec with pitch controllability*.
## 🧠 Overview
PitchFlower achieves pitch controllability through a perturbation strategy. During training, pitch information is removed from the input by applying a random flattening and shifting operation. The model is then trained on a reconstruction task in which the pitch information is provided explicitly.
We use an autoencoder with a residual vector quantization (RVQ) bottleneck and a flow-based decoder to produce high-quality audio. More details can be found in the paper.
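To give a rough intuition for the flattening and shifting perturbation, here is a minimal sketch that operates on an F0 contour only. It is an illustration under stated assumptions, not the procedure used in the paper or in the repository: the function name `flatten_and_shift_f0`, the use of the geometric mean as the flat pitch value, the `max_shift_semitones` range, and the convention that unvoiced frames are marked with 0 Hz are all hypothetical choices made for this example.

```python
import numpy as np


def flatten_and_shift_f0(f0_hz, max_shift_semitones=4.0, rng=None):
    """Illustrative perturbation: flatten an F0 contour, then shift it randomly.

    Assumptions (not taken from the paper): unvoiced frames are encoded as 0 Hz,
    the flat value is the geometric mean of the voiced frames, and the shift is
    drawn uniformly in [-max_shift_semitones, +max_shift_semitones].
    """
    rng = np.random.default_rng() if rng is None else rng
    f0_hz = np.asarray(f0_hz, dtype=float)
    voiced = f0_hz > 0
    out = np.zeros_like(f0_hz)
    if not voiced.any():
        return out  # nothing to perturb in a fully unvoiced contour

    # Flatten: replace every voiced frame by a single pitch value.
    center_hz = np.exp(np.mean(np.log(f0_hz[voiced])))

    # Shift: move the flat contour by a random number of semitones.
    shift = rng.uniform(-max_shift_semitones, max_shift_semitones)
    out[voiced] = center_hz * 2.0 ** (shift / 12.0)
    return out


# Example: a short contour with unvoiced frames at both ends.
contour = np.array([0.0, 100.0, 200.0, 0.0])
perturbed = flatten_and_shift_f0(contour, rng=np.random.default_rng(0))
```

After this perturbation the contour no longer carries the original melody, which is the intuition behind supplying the pitch separately as an explicit input during reconstruction.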
## 📦 Installation and Usage
Check out our GitHub repo to learn how to use PitchFlower: https://github.com/diegotg2000/PitchFlower
## 🙌 Acknowledgements
We'd like to acknowledge the repositories from which we drew inspiration and parts of the code:
- Vocos: https://github.com/gemelo-ai/vocos
- WavTokenizer: https://github.com/jishengpeng/WavTokenizer
- Encodec: https://github.com/facebookresearch/encodec
This work was carried out in the [Analysis/Synthesis team of the STMS laboratory](https://www.stms-lab.fr/team/analyse-et-synthese-des-sons/) at IRCAM. It was funded by the [ANR project EVA](https://anr.fr/Project-ANR-23-CE23-0018).
## 📫 Contact
For questions or collaboration opportunities, feel free to reach out: dtorres@ircam.fr
## 🧩 Citation
```bibtex
@misc{pitchflower,
title={PitchFlower: A flow-based neural audio codec with pitch controllability},
author={Diego Torres and Axel Roebel and Nicolas Obin},
year={2025},
eprint={2510.25566},
archivePrefix={arXiv},
url={https://arxiv.org/abs/2510.25566},
}
```
## 📜 License
This project is licensed under the [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license.