Migrate to ZeroGPU + Gradio 6 with bug fixes

#6
by hysts HF Staff - opened

Hi @pq-yang, I've been working on migrating this Space to ZeroGPU. Along the way I ran into several bugs in the existing code that needed fixing first, so this PR ended up larger than expected, but the changes are all necessary for ZeroGPU to work correctly.

Once this is reviewed and merged, I'll switch the hardware grant from the current L4 GPU to ZeroGPU, which is more cost-efficient and better for the community.

Below is a categorized summary to make review easier.

ZeroGPU support

  • Add import spaces and @spaces.GPU decorators to the three GPU-bound functions:
    • _sam_refine_gpu (duration=30): new GPU worker split out from sam_refine
    • image_matting (duration=60)
    • video_matting (duration=120)
  • Split sam_refine into a CPU-side handler (extracts evt.index, updates state) + a GPU worker (runs SAM inference). This is required because ZeroGPU cannot serialize Gradio event objects.
  • Remove set_image() calls from select_video_template / select_image_template โ€” the GPU worker now handles reset_image() + set_image() on each invocation.
  • Eagerly load all models at startup. Lazy loading inside @spaces.GPU functions causes issues with ZeroGPU's allocation lifecycle.
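
The handler/worker split above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function bodies, the FakeSelectEvent stand-in, and the no-op fallback decorator are assumptions made so the snippet runs outside a Space. Only the @spaces.GPU(duration=...) decorator and the evt.index attribute come from the description above.

```python
from dataclasses import dataclass

try:
    import spaces  # provided on Hugging Face Spaces
    gpu = spaces.GPU
except ImportError:
    # Assumption: a no-op decorator factory so the sketch runs off-Spaces.
    def gpu(duration=60):
        def wrap(fn):
            return fn
        return wrap


@gpu(duration=30)
def _sam_refine_gpu(point, click_state):
    """GPU worker: receives only plain, serializable values."""
    # Placeholder for SAM inference. Per the PR description, the real
    # worker would call reset_image() + set_image() on each invocation.
    return {"point": point, "num_clicks": len(click_state)}


def sam_refine(evt, click_state):
    """CPU-side handler: unpack the event, update state, call the worker."""
    point = tuple(evt.index)             # extract plain data from the event
    click_state = click_state + [point]  # update state on the CPU side
    result = _sam_refine_gpu(point, click_state)
    return result, click_state           # return state so clicks are not lost


@dataclass
class FakeSelectEvent:
    """Hypothetical stand-in for a Gradio select event, exposing only .index."""
    index: tuple
```

The event object never crosses into the decorated function; only the extracted coordinates and the click list do.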

Gradio 4 → 6 migration

  • Update sdk_version to 6.9.0 and python_version to 3.12.12 in README frontmatter.
  • Remove broken Svelte-internal CSS selectors (.svelte-lcpz3o etc.) that no longer apply in Gradio 6.
  • Remove the video_input.change event handler (redundant with .clear in Gradio 6).

Bug fixes

  • Resize condition typo: the condition was image_size[0]>=1080 and image_size[0]>=1080, which checked height twice and never checked width. The second check is now image_size[1]>=1080.
  • Silent mask corruption: When the template mask was empty (all zeros), the old code silently set template_mask[0][0]=1, producing garbage output. Now raises gr.Error to tell the user to set a mask first.
  • State desync in get_frames_from_video: The function was not returning interactive_state or click_state, so stale state from a previous session could leak into the next one. Now returns freshly initialized state.
  • Missing click_state output in sam_refine: The function mutated click_state but didn't return it, causing clicks to be lost. Added it to the outputs.
  • Image input missing resize: High-resolution images (>1080p) were not resized on the image tab, unlike the video tab. Added the same resize logic.
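
To make the first two fixes concrete, here is a small dependency-free sketch. It assumes image_size is ordered (height, width), as the description of the typo implies, and it uses ValueError where the PR raises gr.Error. The helper names needs_resize, buggy_needs_resize, and require_mask are hypothetical.

```python
MAX_SIDE = 1080  # threshold taken from the PR description


def needs_resize(image_size):
    """Fixed condition: both height (index 0) and width (index 1) are checked."""
    return image_size[0] >= MAX_SIDE and image_size[1] >= MAX_SIDE


def buggy_needs_resize(image_size):
    """Original typo: index 0 is checked twice, so width is ignored."""
    return image_size[0] >= MAX_SIDE and image_size[0] >= MAX_SIDE


def require_mask(template_mask):
    """Reject an all-zero mask instead of silently patching one pixel."""
    if not any(any(row) for row in template_mask):
        # The PR raises gr.Error here; ValueError keeps this sketch standalone.
        raise ValueError("Please set a mask first.")
    return template_mask
```

A tall, narrow input such as (2000, 500) shows the difference: the buggy check asks for a resize, the fixed one does not.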

Stability improvements

  • Add MAX_FRAMES env var (default: 200) to cap extracted frames and prevent OOM on long videos.
  • Raise gr.Error when zero frames are read from a video.
  • Write video outputs to tempfile.mkdtemp() instead of ./results/ to prevent concurrent users from overwriting each other's output files.
  • Replace hardcoded /home/user/app/ paths with _HERE-based relative paths.
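
A rough sketch of how these stability changes might look, using only the standard library. The helper names cap_frames, new_output_path, and asset_path are hypothetical, the "matting_" prefix is an arbitrary choice, and ValueError stands in for gr.Error. MAX_FRAMES, its default of 200, tempfile.mkdtemp(), and _HERE come from the list above.

```python
import os
import tempfile
from pathlib import Path

# Cap on extracted frames; the default of 200 comes from the PR description.
MAX_FRAMES = int(os.environ.get("MAX_FRAMES", "200"))

# Directory containing this file, used instead of a hardcoded /home/user/app/.
# Falls back to the working directory when __file__ is undefined (e.g. a REPL).
_HERE = Path(__file__).resolve().parent if "__file__" in globals() else Path.cwd()


def cap_frames(frames, limit=MAX_FRAMES):
    """Keep at most `limit` frames and fail loudly when none were read."""
    frames = list(frames)[:limit]
    if not frames:
        # The PR raises gr.Error here; ValueError keeps this sketch standalone.
        raise ValueError("No frames could be read from the video.")
    return frames


def new_output_path(filename):
    """Return a path inside a fresh temp directory, unique per call."""
    out_dir = tempfile.mkdtemp(prefix="matting_")
    return os.path.join(out_dir, filename)


def asset_path(*parts):
    """Resolve a bundled asset relative to this file's directory."""
    return str(_HERE.joinpath(*parts))
```

Because each call to new_output_path creates its own directory, two users writing the same filename at the same time get different paths.
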
hysts changed pull request status to open

Thanks a lot for the contribution and for the very clear breakdown of the changes, which makes the review much easier.

I've gone through the PR and tested it on my side. The ZeroGPU migration and the bug fixes look good to me.

I'll go ahead and merge this. Thanks again for the great work, and also for helping switch the Space hardware to ZeroGPU!

PeiqingYang changed pull request status to merged
