C4G-HKUST committed on
Commit 8216d98 · 1 Parent(s): 32942ab

Update quality mode duration calculation: add 60s preprocessing time, change formula to 60s + video_seconds × steps × 2.5s
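To make the effect of the formula change concrete, here is a minimal sketch that evaluates the previous and updated expressions side by side. The helper names `old_duration`, `new_duration`, and `compare` are hypothetical and exist only for this illustration; the arithmetic is copied from the `get_duration` bodies shown in the `app.py` diff.

```python
def old_duration(video_seconds: float, steps: int) -> int:
    """Previous formula from this commit's removed line: video_seconds * steps * 3.5 s."""
    return int(video_seconds * steps * 3.5)


def new_duration(video_seconds: float, steps: int) -> int:
    """Updated formula: 60 s preprocessing + video_seconds * steps * 2.5 s."""
    return int(60 + video_seconds * steps * 2.5)


def compare(video_seconds: float, steps: int) -> str:
    """Format both GPU duration estimates for one (length, steps) setting."""
    old = old_duration(video_seconds, steps)
    new = new_duration(video_seconds, steps)
    return f"{video_seconds}s video, {steps} steps: old={old}s, new={new}s"


if __name__ == "__main__":
    # Sample settings chosen for illustration only.
    for video_seconds, steps in [(2, 25), (4, 25), (10, 40)]:
        print(compare(video_seconds, steps))
```

The two formulas agree when video_seconds × steps = 60 (both give 210s). Below that point the updated formula requests more GPU time because of the fixed 60s term (for example 185s instead of 175s for a 2-second video at 25 steps); above it the updated formula requests less (310s instead of 350s for 4 seconds at 25 steps).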

Files changed (2)
  1. README.md +9 -12
  2. app.py +4 -4
README.md CHANGED
@@ -212,21 +212,18 @@ python app.py
  #### Generation Modes
  The Gradio demo provides two generation modes:
 
- - **Fast Mode (up to 210s GPU budget)**:
- - Fixed 10 denoising steps for quick generation
- - Suitable for single-person videos or quick previews
- - Lower GPU usage quota consumption
- - The 210s is the maximum GPU allocation time (budget), not the actual generation time
- - **For free-tier users: Fast mode can generate approximately 6 seconds of two-person video at most; longer videos may timeout.**
-
- - **Quality Mode (up to 720s GPU budget)**:
- - Custom denoising steps (adjustable via "Diffusion steps" slider)
+ - **Fast Mode (120s GPU budget, suitable for any type of users)**:
+ - Fixed 8 denoising steps for quick generation
+ - Maximum video duration: 4 seconds
+ - Audio inputs longer than 4 seconds will be automatically trimmed to 4 seconds
+
+ - **Quality Mode (Dynamic GPU budget)**:
+ - Custom denoising steps (adjustable via "Diffusion steps" slider, default: 25 steps)
  - Recommended for multi-person videos that require higher quality
+ - GPU duration is dynamically calculated as: **60s (preprocessing) + video_seconds × steps × 2.5s**
  - Longer generation time but better quality output
- - The 720s is the maximum GPU allocation time (budget), not the actual generation time
- - With 40 denoising steps, approximately 10 seconds of video can be generated
 
- **Design Rationale**: Multi-person videos generally have longer duration and require more computational resources. To achieve better quality, especially for complex multi-person interactions, more denoising steps and longer GPU allocation time are needed. The Quality Mode provides sufficient Usage Quota (up to 720 seconds) to accommodate these requirements, while the Fast Mode offers a quick preview option with fixed 10 steps for faster iteration. Note that the GPU duration values (210s/720s) represent the maximum budget allocated, not the actual generation time.
+ **Design Rationale**: Multi-person videos generally have longer duration and require more computational resources. To achieve better quality, especially for complex multi-person interactions, more denoising steps and longer GPU allocation time are needed. The Quality Mode dynamically allocates GPU time based on video length and denoising steps, while the Fast Mode offers a quick preview option with fixed 8 steps and a 4-second maximum duration for faster iteration.
 
 
 
app.py CHANGED
@@ -648,9 +648,9 @@ def run_graio_demo(args):
  def get_duration(video_seconds, steps):
  """
  Compute the GPU duration required for quality mode
- duration = video seconds * steps * 3.5 seconds
+ duration = 60s (preprocessing time) + video seconds * steps * 2.5 seconds
  """
- return int(video_seconds * steps * 3.5)
+ return int(60 + video_seconds * steps * 2.5)
 
  # Create the dynamic duration calculation function for quality mode
  def calculate_quality_duration(*args, **kwargs):
@@ -932,7 +932,7 @@ def run_graio_demo(args):
  gr.Markdown("""
  **Generation Modes:**
  - **Fast Mode (120s GPU budget, suitable for any type of users)**: Fixed 8 denoising steps for quick generation. Maximum video duration: 4 seconds.
- - **Quality Mode (Dynamic GPU budget)**: Custom denoising steps (adjustable via "Diffusion steps" slider, default: 25 steps). GPU duration is dynamically calculated as: video_seconds × steps × 3.5 s.
+ - **Quality Mode (Dynamic GPU budget)**: Custom denoising steps (adjustable via "Diffusion steps" slider, default: 25 steps). GPU duration is dynamically calculated as: 60s (preprocessing) + video_seconds × steps × 2.5 s.
 
  *Note: Fast mode has a fixed 120s GPU budget. Quality mode dynamically allocates GPU time based on video length and denoising steps. Multi-person videos generally require longer duration and more Usage Quota for better quality.*
  """)
@@ -1058,7 +1058,7 @@ def run_graio_demo(args):
  # Compute the duration actually used
  actual_duration = get_duration(actual_video_seconds, actual_steps)
  # Notify the user via gr.Info
- info_msg = f"Video generation completed! Duration used: {actual_duration}s (estimated: {actual_video_seconds:.2f}s video × {actual_steps} steps × 2s)"
+ info_msg = f"Video generation completed! Duration used: {actual_duration}s (60s preprocessing + {actual_video_seconds:.2f}s video × {actual_steps} steps × 2.5s)"
  gr.Info(info_msg)
  return output_file
  else: