Update README.md
README.md
CHANGED
@@ -6,7 +6,6 @@ pipeline_tag: text-to-audio
 
 # Foley-Omni
 
-**Foley-Omni: A Unified Multimodal Generation Model from Task-Level Audio Synthesis to Complete Video Soundtrack Generation**
 
 [GitHub Code](https://github.com/NJU-Speech/Foley-Omni) | [arXiv](https://arxiv.org/abs/2606.03672) | [Demo](https://ty0402.github.io/Foley-omni-Web/)
 
@@ -15,6 +14,8 @@ pipeline_tag: text-to-audio
 This repository packages the public inference checkpoint set for **Foley-Omni**.
 The release focuses on **Video-to-Soundtrack (V2ST)** generation, where the model jointly generates synchronized **speech**, **sound effects**, and **music** from a video and optional text prompt.
 
+## Model Size
+5.5B
 
 ## Repository Contents
 