Update README.md
README.md
CHANGED
@@ -6,7 +6,9 @@ library_name: stable-audio-tools
 
 # AudioX
 
-## 🎧 AudioX: Diffusion Transformer for Anything-to-Audio Generation
+## 🎧 [ICLR 2026] AudioX: Diffusion Transformer for Anything-to-Audio Generation
+
+**Accepted to ICLR 2026** 🎉
 
 [TL;DR]: AudioX is a unified Diffusion Transformer model for Anything-to-Audio and Music Generation, capable of generating high-quality general audio and music, offering flexible natural language control, and seamlessly processing various modalities including text, video, image, music, and audio.
 
@@ -97,13 +99,21 @@ if video_path is not None and os.path.exists(video_path):
 ## Citation
 If you find our work useful, please consider citing:
 
-```
+```bibtex
 @article{tian2025audiox,
-title={
+title={Audiox: Diffusion transformer for anything-to-audio generation},
 author={Tian, Zeyue and Jin, Yizhu and Liu, Zhaoyang and Yuan, Ruibin and Tan, Xu and Chen, Qifeng and Xue, Wei and Guo, Yike},
 journal={arXiv preprint arXiv:2503.10522},
 year={2025}
 }
+
+@inproceedings{tian2025vidmuse,
+title={Vidmuse: A simple video-to-music generation framework with long-short-term modeling},
+author={Tian, Zeyue and Liu, Zhaoyang and Yuan, Ruibin and Pan, Jiahao and Liu, Qifeng and Tan, Xu and Chen, Qifeng and Xue, Wei and Guo, Yike},
+booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
+pages={18782--18793},
+year={2025}
+}
 ```
 
 