AndreasXi/MeanAudio

AndreasXi committed on Aug 17, 2025

Commit fbab4e1 (verified) · 1 Parent(s): c0a034a

Update README.md

Files changed (1)

README.md +60 -3

@@ -1,3 +1,60 @@
----
-license: mit
----

+---
+license: mit
+---
+<div align="center">
+<p align="center">
+  <h1>MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows</h1>
+  <!-- <a href=>Paper</a> | <a href="https://meanaudio.github.io/">Webpage</a>  -->
+  [![Webpage](https://img.shields.io/badge/Website-Visit-orange)](https://meanaudio.github.io/)
+  [![Hugging Face Model](https://img.shields.io/badge/Hugging%20Face-Model-brightgreen)](https://huggingface.co/junxiliu/MeanAudio)
+  [![Paper](https://img.shields.io/badge/Paper-DOI-blue)](https://arxiv.org/abs/2508.06098)
+  [![Code](https://img.shields.io/badge/Code-Repo-black?style=flat&logo=github&logoColor=white)](https://github.com/xiquan-li/MeanAudio?tab=readme-ov-file)
+</p>
+</div>
+## Overview
+MeanAudio is a novel MeanFlow-based model for fast and faithful text-to-audio generation. It synthesizes realistic sound in a single step, achieving a real-time factor (RTF) of 0.013 on a single NVIDIA RTX 3090 GPU, and it also performs strongly in multi-step generation.
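To make the RTF figure concrete, here is a minimal sketch of the arithmetic. It assumes the common definition of real-time factor (generation time divided by the duration of the generated audio), which the README does not spell out; the 10-second clip length and the helper names are hypothetical and chosen only for illustration.

```python
# Assumption: RTF = generation time / duration of the generated audio.
# The README reports the value 0.013 but does not define the metric.


def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Return the real-time factor for one generation run."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def generation_time(rtf: float, audio_seconds: float) -> float:
    """Return the wall-clock time implied by an RTF for a clip of the given length."""
    return rtf * audio_seconds


REPORTED_RTF = 0.013  # value quoted in the overview
clip_seconds = 10.0   # hypothetical clip length, not taken from the README

print(f"Implied generation time: {generation_time(REPORTED_RTF, clip_seconds):.2f} s")
# prints "Implied generation time: 0.13 s"
```

Under that definition, any RTF below 1 means audio is produced faster than it plays back.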
+## Environment Setup
+**1. Create a new conda environment:**
+```bash
+conda create -n meanaudio python=3.11 -y
+conda activate meanaudio
+pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 --upgrade
+```
+<!-- ```
+conda install -c conda-forge 'ffmpeg<7'
+```
+(Optional, if you use miniforge and don't already have the appropriate ffmpeg) -->
+**2. Install with pip:**
+```bash
+git clone https://github.com/xiquan-li/MeanAudio.git
+cd MeanAudio
+pip install -e .
+```
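After both install steps, a quick check that the PyTorch packages are importable can save a failed first run. The sketch below is not part of the repository; `missing_packages` is a hypothetical helper, and it only checks the three packages named in the `pip install` command above, since the import name of the MeanAudio package itself is not given in the README.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages installed by the pip command in step 1.
REQUIRED = ("torch", "torchvision", "torchaudio")

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Run it inside the activated `meanaudio` environment; anything listed as missing points back to step 1.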
+<!-- (If you encounter the File "setup.py" not found error, upgrade your pip with pip install --upgrade pip) -->
+## Quick Start
+<!-- **1. Download pre-trained models:** -->
+To generate audio with our pre-trained model, simply run:
+```bash
+python demo.py --prompt 'your prompt' --num_steps 1
+```
+This automatically downloads the pre-trained checkpoints from Hugging Face and generates audio from your prompt.
+The output audio is saved to `MeanAudio/output/`, and the checkpoints are saved to `MeanAudio/weights/`.
+Have fun!
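To generate audio for several prompts, the `demo.py` command above can be wrapped in a small loop. This is a sketch, not part of the repository: `build_command` and `run_batch` are hypothetical helpers, only the `--prompt` and `--num_steps` flags shown in the Quick Start are used, and the script is assumed to be run from the `MeanAudio` directory.

```python
import subprocess
import sys


def build_command(prompt, num_steps=1):
    """Build the demo.py invocation using only the flags shown in the Quick Start."""
    return [sys.executable, "demo.py", "--prompt", prompt, "--num_steps", str(num_steps)]


def run_batch(prompts, num_steps=1, dry_run=True):
    """Run demo.py once per prompt; with dry_run=True, only print the commands."""
    commands = [build_command(prompt, num_steps) for prompt in prompts]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the batch at the first failing run.
            subprocess.run(command, check=True)
    return commands
```

For example, `run_batch(["a dog barking", "rain on a tin roof"], dry_run=False)` would invoke `demo.py` twice with `--num_steps 1`; both prompts are placeholders. Each call starts a fresh Python process, so the model is reloaded every time, which is simple but not the fastest option for large batches.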