BoxOfColors committed
Commit 1d1ce31 · 1 parent: 8635f79

Fix TARO GPU abort: increase load_overhead 15→120s to cover model loading time


TARO loads CAVP extractor, onset model, MMDiT, AudioLDM2 VAE, and BigVGAN
vocoder before inference. This took >60s total, exhausting the GPU budget
before _load_taro_models could complete. 120s overhead + inference time
gives enough headroom for all model loading + feature extraction.
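The failure mode above can be made concrete with a small sketch. Assuming the app derives its per-request GPU time budget from these config fields (the helper name `estimate_gpu_duration` and the exact formula are hypothetical; only the field names and values come from the diff), the old 15 s overhead capped a 25-step run at roughly 15.6 s of GPU time, well under the >60 s needed just to load the five models:

```python
# Hypothetical sketch of how a GPU duration budget could be derived from
# the MODEL_CONFIGS fields touched by this commit. The config shape and
# values mirror the diff; the helper and formula are assumptions, not
# the app's actual code.

MODEL_CONFIGS = {
    "taro": {
        "window_s": 8.192,       # TARO_MODEL_DUR
        "sr": 16000,             # TARO_SR
        "secs_per_step": 0.025,  # measured 0.023 s/step on H200
        "load_overhead": 120,    # was 15 before this commit
    },
}

def estimate_gpu_duration(model_key: str, num_steps: int) -> float:
    """Budget = one-time load overhead + per-step inference time."""
    cfg = MODEL_CONFIGS[model_key]
    return cfg["load_overhead"] + num_steps * cfg["secs_per_step"]

print(estimate_gpu_duration("taro", 25))  # → 120.625
```

With the old value, the same call would return 15.625, so the budget expired while `_load_taro_models` was still running, which matches the abort described above.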

Files changed (1): app.py (+1 -1)
app.py CHANGED
@@ -512,7 +512,7 @@ MODEL_CONFIGS = {
     "window_s": TARO_MODEL_DUR,   # 8.192 s
     "sr": TARO_SR,                # 16000 (output resampled to TARGET_SR)
     "secs_per_step": 0.025,       # measured 0.023 s/step on H200
-    "load_overhead": 15,          # model load + CAVP feature extraction
+    "load_overhead": 120,         # CAVP+onset+MMDiT+VAE+vocoder load + feature extraction
     "tab_prefix": "taro",
     "label": "TARO",
     "regen_fn": None,             # set after function definitions (avoids forward-ref)