arxiv:2603.07815

HybridStitch: Pixel and Timestep Level Model Stitching for Diffusion Acceleration

Published on Mar 8

Submitted by Jintao Zhang on Mar 16

University of California, Berkeley


Authors: Desen Sun, Jason Hon, Jintao Zhang

AI-generated summary

HybridStitch enables faster text-to-image generation by combining large and small models to efficiently handle different image complexity regions.

Abstract

Diffusion models have demonstrated a remarkable ability in Text-to-Image (T2I) generation applications. Despite their advanced generation output, they suffer from heavy computation overhead, especially large models that contain tens of billions of parameters. Prior work has shown that replacing part of the denoising steps with a smaller model still maintains the generation quality. However, these methods only save computation for some timesteps, ignoring the difference in compute demand within a single timestep. In this work, we propose HybridStitch, a new T2I generation paradigm that treats generation like editing. Specifically, we introduce a hybrid stage that jointly incorporates both the large model and the small model. HybridStitch separates the entire image into two regions: one that is relatively easy to render, enabling an early transition to the smaller model, and another that is more complex and therefore requires refinement by the large model. HybridStitch employs the small model to construct a coarse sketch while using the large model to edit and refine the complex regions. According to our evaluation, HybridStitch achieves a 1.83× speedup on Stable Diffusion 3, which is faster than all existing mixture-of-model methods.
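The hybrid stage described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the two denoisers are stand-in functions, the latent is a flat list of patch values, and the complexity criterion (largest magnitude) is an assumption made only for illustration, since the abstract does not say how the regions are chosen. All function names are hypothetical.

```python
# Toy sketch of one "hybrid" denoising step (pixel-level stitching).
# Everything here is illustrative: the models, the complexity criterion,
# and the update rule are assumptions, not the paper's method.

def toy_small_model(x, t):
    """Stand-in for the cheap denoiser; runs on every patch."""
    return [0.5 * v for v in x]


def toy_large_model(x, t, indices):
    """Stand-in for the expensive denoiser; runs only on selected patches."""
    return {i: 0.9 * x[i] for i in indices}


def select_complex(x, fraction):
    """Hypothetical criterion: treat the largest-magnitude patches as complex."""
    k = max(1, round(fraction * len(x)))
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    return sorted(order[:k])


def hybrid_step(x, t, fraction, step_size=0.1):
    """Small model predicts everywhere; large model overrides complex patches."""
    coarse = toy_small_model(x, t)            # coarse sketch on all patches
    hard = select_complex(x, fraction)        # patches routed to the large model
    refined = toy_large_model(x, t, hard)     # refinement on complex patches only
    pred = list(coarse)
    for i, v in refined.items():
        pred[i] = v                           # stitch the two predictions
    new_x = [xi - step_size * p for xi, p in zip(x, pred)]
    return new_x, hard


if __name__ == "__main__":
    latent = [0.1, -2.0, 0.3, 1.5]
    updated, complex_patches = hybrid_step(latent, t=0, fraction=0.5)
    print("patches sent to the large model:", complex_patches)
    print("updated latent:", updated)
```

In this sketch the large model is evaluated on only a fraction of the patches per step, which is where the within-timestep compute saving would come from; the actual saving depends on how the real models handle partial inputs.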


Community

jt-zhang

Paper author Paper submitter about 13 hours ago

HybridStitch separates the entire image into two regions: one that is relatively easy to render, enabling an early transition to the smaller model, and another that is more complex and therefore requires refinement by the large model. HybridStitch employs the small model to construct a coarse sketch while using the large model to edit and refine the complex regions. HybridStitch achieves a ~2× speedup in diffusion model inference.

