akhaliq (HF Staff) committed · verified
Commit 5fb66a1 · 1 Parent(s): 03a98e9

Update Gradio app with multiple files

Files changed (1): app.py (+20, −7)
app.py CHANGED
@@ -122,11 +122,11 @@ with gr.Blocks(
 
     gr.Markdown(
         """
-        # 🎬 Image to Video Generator
+        # 🎬 Image to Video Generator with Ovi
 
-        Transform your static images into dynamic videos using AI! Upload an image and describe the motion you want to see.
+        Transform your static images into dynamic videos with synchronized audio using AI! Upload an image and describe the motion you want to see.
 
-        Powered by the **Ovi** model via HuggingFace Inference API.
+        Powered by **Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation** via HuggingFace Inference API.
         """
     )
 
@@ -137,7 +137,8 @@
     <ul>
     <li>Use clear, well-lit images with a single main subject</li>
     <li>Write specific prompts describing the desired motion or action</li>
-    <li>Keep prompts concise and focused on movement</li>
+    <li>Keep prompts concise and focused on movement and audio elements</li>
+    <li>Processing generates 5-second videos at 24 FPS with synchronized audio</li>
     <li>Processing may take 30-60 seconds depending on server load</li>
     </ul>
     </div>
@@ -182,10 +183,19 @@
 
     gr.Markdown(
         """
-        ### About the Model
+        ### About the Ovi Model
 
-        This app uses the **Ovi** model, which specializes in generating realistic video animations from static images.
-        The model can understand natural language prompts to create various types of motion and animation.
+        **Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation**
+
+        Developed by Chetwin Low, Weimin Wang (Character AI) & Calder Katyal (Yale University)
+
+        🌟 **Key Features:**
+        - 🎬 **Video+Audio Generation**: Generates synchronized video and audio content simultaneously
+        - 📝 **Flexible Input**: Supports text-only or text+image conditioning
+        - ⏱️ **5-second Videos**: Generates 5-second videos at 24 FPS
+        - 📐 **Multiple Aspect Ratios**: Supports a 720×720 pixel area at various ratios (9:16, 16:9, 1:1, etc.)
+
+        Ovi is a Veo 3-like model that uses twin-backbone cross-modal fusion for high-quality audio-video generation.
         """
     )
 
@@ -216,6 +226,8 @@
     ### ⚠️ Notes
 
     - Video generation may take 30-60 seconds
+    - Generates 5-second videos at 24 FPS with synchronized audio
+    - Supports multiple aspect ratios (9:16, 16:9, 1:1, etc.) at a 720×720 pixel area
     - Requires a valid HuggingFace token with Inference API access
     - Best results with clear, high-quality images
     - The model works best with realistic subjects and natural motions
@@ -224,6 +236,7 @@
 
     - [Ovi Model Card](https://huggingface.co/chetwinlow1/Ovi)
     - [HuggingFace Inference API](https://huggingface.co/docs/huggingface_hub/guides/inference)
+    - [Character AI](https://character.ai)
     """
     )
 