Update README.md
README.md
CHANGED
|
@@ -1,3 +1,60 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: mit
|
| 3 |
-
---
|
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
---
|
| 4 |
+
|
| 5 |
+
<div align="center">
|
| 6 |
+
<p align="center">
|
| 7 |
+
<h1>MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows</h1>
|
| 8 |
+
<!-- <a href=>Paper</a> | <a href="https://meanaudio.github.io/">Webpage</a> -->
|
| 9 |
+
|
| 10 |
+
[Webpage](https://meanaudio.github.io/)
|
| 11 |
+
[Hugging Face](https://huggingface.co/junxiliu/MeanAudio)
|
| 12 |
+
[arXiv](https://arxiv.org/abs/2508.06098)
|
| 13 |
+
[GitHub](https://github.com/xiquan-li/MeanAudio?tab=readme-ov-file)
|
| 14 |
+
|
| 15 |
+
|
| 16 |
+
</p>
|
| 17 |
+
</div>
|
| 18 |
+
|
| 19 |
+
|
| 20 |
+
## Overview
|
| 21 |
+
MeanAudio is a MeanFlow-based model for fast and faithful text-to-audio generation. It synthesizes realistic sound in a single step, reaching a real-time factor (RTF) of 0.013 on a single NVIDIA RTX 3090 GPU, and it also performs strongly in multi-step generation.
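To make the RTF figure concrete, here is a small sketch that assumes the common definition of RTF (generation time divided by the duration of the audio produced). The 10-second clip length is a hypothetical value chosen for illustration, not a number from this project.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def generation_time(rtf: float, audio_seconds: float) -> float:
    """Invert the definition: estimated generation time for a clip."""
    return rtf * audio_seconds


REPORTED_RTF = 0.013   # single-step figure quoted in the overview
CLIP_SECONDS = 10.0    # hypothetical clip length, for illustration only

estimated = generation_time(REPORTED_RTF, CLIP_SECONDS)
print(f"A {CLIP_SECONDS:.0f} s clip at RTF {REPORTED_RTF} takes about {estimated:.2f} s")
```

Under that definition, an RTF below 1 means audio is generated faster than it plays back.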
|
| 22 |
+
|
| 23 |
+
|
| 24 |
+
## Environment Setup
|
| 25 |
+
|
| 26 |
+
**1. Create a new conda environment and install PyTorch:**
|
| 27 |
+
|
| 28 |
+
```bash
|
| 29 |
+
conda create -n meanaudio python=3.11 -y
|
| 30 |
+
conda activate meanaudio
|
| 31 |
+
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 --upgrade
|
| 32 |
+
```
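After installing, you may want to confirm that PyTorch is importable and can see your GPU. The helper below is a minimal sketch, not part of the repository; `describe_torch_install` is a hypothetical name, and it only uses standard PyTorch attributes.

```python
import importlib.util


def describe_torch_install() -> dict:
    """Report whether PyTorch is installed and whether CUDA is usable."""
    # find_spec returns None when the package cannot be imported.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        "version": torch.__version__,
        # CUDA version the wheel was built against (None for CPU-only builds).
        "cuda_build": torch.version.cuda,
        # True only if a compatible GPU and driver are visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


print(describe_torch_install())
```

If `cuda_available` is false on a machine with a GPU, the driver and the CUDA build of the wheel are the usual things to check.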
|
| 33 |
+
<!-- ```
|
| 34 |
+
conda install -c conda-forge 'ffmpeg<7'
|
| 35 |
+
```
|
| 36 |
+
(Optional, if you use miniforge and don't already have the appropriate ffmpeg) -->
|
| 37 |
+
|
| 38 |
+
**2. Install with pip:**
|
| 39 |
+
|
| 40 |
+
```bash
|
| 41 |
+
git clone https://github.com/xiquan-li/MeanAudio.git
|
| 42 |
+
|
| 43 |
+
cd MeanAudio
|
| 44 |
+
pip install -e .
|
| 45 |
+
```
|
| 46 |
+
|
| 47 |
+
<!-- (If you encounter the File "setup.py" not found error, upgrade your pip with pip install --upgrade pip) -->
|
| 48 |
+
|
| 49 |
+
|
| 50 |
+
## Quick Start
|
| 51 |
+
|
| 52 |
+
<!-- **1. Download pre-trained models:** -->
|
| 53 |
+
To generate audio with our pre-trained model, simply run:
|
| 54 |
+
```bash
|
| 55 |
+
python demo.py --prompt 'your prompt' --num_steps 1
|
| 56 |
+
```
|
| 57 |
+
This automatically downloads the pre-trained checkpoints from Hugging Face and generates audio from your prompt.
|
| 58 |
+
The generated audio is saved to `MeanAudio/output/`, and the checkpoints are stored in `MeanAudio/weights/`.
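If you want to try several prompts in a row, the command above can be wrapped in a short script. This is a sketch under assumptions: it uses only the `--prompt` and `--num_steps` flags shown above, the helper names and example prompts are hypothetical, and it defaults to a dry run that prints the commands instead of executing them.

```python
import shlex
import subprocess
import sys


def build_demo_command(prompt: str, num_steps: int = 1) -> list[str]:
    """Build the demo.py invocation using only the flags shown above."""
    return [sys.executable, "demo.py", "--prompt", prompt, "--num_steps", str(num_steps)]


def run_prompts(prompts, num_steps: int = 1, repo_dir: str = ".", dry_run: bool = True):
    """Run demo.py once per prompt; with dry_run=True, only print the commands."""
    commands = [build_demo_command(p, num_steps) for p in prompts]
    for command in commands:
        if dry_run:
            print(shlex.join(command))
        else:
            # repo_dir should point at the cloned MeanAudio directory.
            subprocess.run(command, cwd=repo_dir, check=True)
    return commands


# Hypothetical prompts, for illustration only.
EXAMPLE_PROMPTS = ["a dog barking in the distance", "rain falling on a tin roof"]
run_prompts(EXAMPLE_PROMPTS)
```

Call `run_prompts(EXAMPLE_PROMPTS, dry_run=False, repo_dir="MeanAudio")` to actually generate audio once the setup steps are done.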
|
| 59 |
+
|
| 60 |
+
Have fun!
|