Revert to shipped scene-emb .pt; pre-warm pipeline blocks for ZeroGPU

#1
by linoyts HF Staff - opened

Two changes:

  1. Drop Gemma from inference. Re-download the shipped comfyui_models_loras_ltxv_ltx2_ltx-2.3-22b-ic-lora-hdr-scene-emb.pt from diffusers-internal-dev/LTX-HDR-LoRA and hand it directly to HDRICLoraPipeline. This removes the prompt textbox and the ~20s per-call Gemma 12B load+encode, and it matches the scene embedding the HDR IC-LoRA was trained with.

  2. Pre-warm the pipeline at module load. Build the fp8-cast LoRA-fused transformer once (shared between stage_1 and stage_2), plus the ImageConditioner encoder, the VideoUpsampler (encoder + upsampler), and the VideoDecoder. Replace the pipeline's blocks with cached wrappers that reuse the built models and skip the gpu_model() meta-device free on exit. This avoids re-reading the 22B checkpoint and redoing the LoRA fusion and fp8 cast on every @spaces.GPU call.
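For change 1, fetching the shipped embedding could look roughly like the sketch below. The repo id and filename are the ones named above; `hf_hub_download` and `torch.load` are the real library calls. The helper name `fetch_scene_embedding` and its `download`/`load` parameters are hypothetical, added only so the function can be exercised without network access. The layout of the .pt file and the exact argument through which HDRICLoraPipeline receives it are not shown in this PR, so the sketch stops at returning the loaded object.

```python
REPO_ID = "diffusers-internal-dev/LTX-HDR-LoRA"
FILENAME = "comfyui_models_loras_ltxv_ltx2_ltx-2.3-22b-ic-lora-hdr-scene-emb.pt"


def fetch_scene_embedding(repo_id=REPO_ID, filename=FILENAME, download=None, load=None):
    """Download the shipped scene embedding and load it on CPU.

    `download` and `load` are hypothetical injection points; by default they
    resolve to huggingface_hub.hf_hub_download and torch.load.
    """
    if download is None:
        # Imported lazily so the module can be imported without the Hub client.
        from huggingface_hub import hf_hub_download
        download = hf_hub_download
    if load is None:
        import torch

        def load(path):
            # Assumption: the file holds tensors (or containers of tensors).
            return torch.load(path, map_location="cpu")

    # hf_hub_download caches the file locally and returns its path.
    local_path = download(repo_id=repo_id, filename=filename)
    return load(local_path)
```

Called once at module load, the returned object would then be passed to HDRICLoraPipeline in place of a Gemma-encoded prompt.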

Tradeoff: startup takes longer (every component is built once up front), but each subsequent generation skips the ~30-60s rebuild.
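The build-once idea behind change 2 can be illustrated with a minimal, dependency-free sketch. Everything here is a stand-in: `FakeModel`, `CountingBuilder`, `build_then_free`, `RebuildingBlock`, and `CachedBlock` are hypothetical names, and `build_then_free` only imitates the build-and-free-on-exit behaviour that the PR attributes to gpu_model(). It is not the pipeline's actual code.

```python
from contextlib import contextmanager


class FakeModel:
    """Stand-in for an expensive model (e.g. the fused transformer)."""

    def __init__(self):
        self.freed = False

    def __call__(self, x):
        if self.freed:
            raise RuntimeError("model was freed")
        return x * 2

    def free(self):
        self.freed = True


class CountingBuilder:
    """Builds FakeModel instances and counts how often it is asked to."""

    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        return FakeModel()


@contextmanager
def build_then_free(builder):
    # Imitates a context manager that builds a model and frees it on exit.
    model = builder()
    try:
        yield model
    finally:
        model.free()


class RebuildingBlock:
    """Original behaviour: pays the build cost on every call."""

    def __init__(self, builder):
        self._builder = builder

    def __call__(self, x):
        with build_then_free(self._builder) as model:
            return model(x)


class CachedBlock:
    """Pre-warmed behaviour: holds one built model and never frees it."""

    def __init__(self, model):
        self._model = model

    def __call__(self, x):
        return self._model(x)
```

With this shape, a `CachedBlock` created at module load triggers exactly one build regardless of how many generations follow, whereas a `RebuildingBlock` triggers one build per call. The cost is that the cached model stays resident for the lifetime of the process.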

linoyts changed pull request status to merged
