---
license: apache-2.0
library_name: diffusers
pipeline_tag: text-to-image
tags:
- flow-matching
- distillation
- flux
- stable-diffusion
---
# 🚀 SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation

[arXiv:2506.00523](https://arxiv.org/abs/2506.00523) · [GitHub: XingtongGe/SenseFlow](https://github.com/XingtongGe/SenseFlow) · [Hugging Face: domiso/SenseFlow](https://huggingface.co/domiso/SenseFlow)

[Xingtong Ge](https://xingtongge.github.io/)<sup>1,2</sup>, Xin Zhang<sup>2</sup>, [Tongda Xu](https://tongdaxu.github.io/)<sup>3</sup>, [Yi Zhang](https://zhangyi-3.github.io/)<sup>4</sup>, [Xinjie Zhang](https://xinjie-q.github.io/)<sup>1</sup>, [Yan Wang](https://yanwang202199.github.io/)<sup>3</sup>, [Jun Zhang](https://eejzhang.people.ust.hk/)<sup>1</sup>

<sup>1</sup>HKUST, <sup>2</sup>SenseTime Research, <sup>3</sup>Tsinghua University, <sup>4</sup>CUHK MMLab
## Abstract

Distribution Matching Distillation (DMD) has been successfully applied to text-to-image diffusion models such as Stable Diffusion (SD) 1.5. However, vanilla DMD suffers from convergence difficulties on large-scale flow-based text-to-image models, such as SD 3.5 and FLUX. In this paper, we first analyze the issues that arise when applying vanilla DMD to large-scale models. Then, to overcome the scalability challenge, we propose implicit distribution alignment (IDA) to constrain the divergence between the generator and the fake distribution. Furthermore, we propose intra-segment guidance (ISG) to relocate the timestep denoising importance from the teacher model. With IDA alone, DMD converges for SD 3.5; with both IDA and ISG, DMD converges for SD 3.5 and FLUX.1 dev. Together with a scaled VFM-based discriminator, our final model, dubbed **SenseFlow**, achieves superior distillation performance for both diffusion-based text-to-image models such as SDXL and flow-matching models such as SD 3.5 Large and FLUX.1 dev.
## SenseFlow-FLUX.1 dev (supports 4–8-step generation)

* `SenseFlow-FLUX/diffusion_pytorch_model.safetensors`: the DiT checkpoint.
* `SenseFlow-FLUX/config.json`: the DiT config used in our model.

### Usage

1. Download the FLUX.1 dev base checkpoint to `Path/to/FLUX`.
2. Replace the transformer folder `Path/to/FLUX/transformer` with `SenseFlow-FLUX`; the resulting directory is referred to below as `Path/to/SenseFlow-FLUX`.
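
The two preparation steps can be scripted. The following is a minimal sketch, not part of the SenseFlow repository: the helper name `build_senseflow_flux_dir` and its arguments are hypothetical, and it assumes the base checkpoint uses the standard diffusers layout with a top-level `transformer` folder. Rather than overwriting the base checkpoint in place, it copies it to a new directory and puts the SenseFlow files in the `transformer` slot.

```python
import shutil
from pathlib import Path


def build_senseflow_flux_dir(flux_dir, senseflow_dir, out_dir):
    """Copy a FLUX.1 dev checkpoint and swap in the SenseFlow transformer.

    Hypothetical helper: all three arguments are placeholder paths.
    `out_dir` must not exist yet, because shutil.copytree creates it.
    """
    flux_dir, senseflow_dir, out_dir = Path(flux_dir), Path(senseflow_dir), Path(out_dir)

    def skip_base_transformer(directory, names):
        # Skip only the top-level `transformer` folder of the base checkpoint.
        return ["transformer"] if Path(directory) == flux_dir else []

    # Copy everything except the original transformer weights.
    shutil.copytree(flux_dir, out_dir, ignore=skip_base_transformer)

    # Put the SenseFlow DiT checkpoint and config in the transformer slot.
    shutil.copytree(senseflow_dir, out_dir / "transformer")
    return out_dir
```

For example, `build_senseflow_flux_dir("Path/to/FLUX", "Path/to/downloaded/SenseFlow-FLUX", "Path/to/SenseFlow-FLUX")` leaves the base checkpoint untouched, at the cost of duplicating the text encoders and VAE on disk.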
#### Using the Euler sampler

```python
import torch
from diffusers import FluxPipeline

# The pipeline loads the scheduler stored in the checkpoint's scheduler config,
# so no scheduler override is needed here.
pipe = FluxPipeline.from_pretrained("Path/to/SenseFlow-FLUX", torch_dtype=torch.bfloat16).to("cuda")

prompt = "A cat sleeping on a windowsill with white curtains fluttering in the breeze"

# 4 inference steps; the model supports 4–8-step generation.
image = pipe(
    prompt,
    height=1024,
    width=1024,
    num_inference_steps=4,
    max_sequence_length=512,
).images[0]

image.save("output.png")
```
#### Using the x0 sampler (similar to the LCMScheduler in diffusers)

```python
import torch
from typing import Optional, Tuple, Union

from diffusers import FlowMatchEulerDiscreteScheduler, FluxPipeline
from diffusers.schedulers.scheduling_flow_match_euler_discrete import (
    FlowMatchEulerDiscreteSchedulerOutput,
)


class FlowMatchEulerX0Scheduler(FlowMatchEulerDiscreteScheduler):
    def step(
        self,
        model_output: torch.FloatTensor,
        timestep: Union[float, torch.FloatTensor],
        sample: torch.FloatTensor,
        generator: Optional[torch.Generator] = None,  # accepted for API compatibility; not used below
        return_dict: bool = True,
    ) -> Union[FlowMatchEulerDiscreteSchedulerOutput, Tuple]:
        if self.step_index is None:
            self._init_step_index(timestep)

        sample = sample.to(torch.float32)  # Upcast for numerical precision

        sigma = self.sigmas[self.step_index]
        sigma_next = self.sigmas[self.step_index + 1]

        # 1. Compute x0 from the model output, treating it as a velocity
        #    v = noise - x0, so that x0 = x_t - sigma * v
        x0 = sample - sigma * model_output

        # 2. Add fresh noise to x0 to get the sample for the next step
        noise = torch.randn_like(sample)
        prev_sample = (1 - sigma_next) * x0 + sigma_next * noise
        prev_sample = prev_sample.to(model_output.dtype)  # Convert back to the original dtype

        self._step_index += 1  # Move to the next step

        if not return_dict:
            return (prev_sample,)
        return FlowMatchEulerDiscreteSchedulerOutput(prev_sample=prev_sample)


pipe = FluxPipeline.from_pretrained("Path/to/SenseFlow-FLUX", torch_dtype=torch.bfloat16).to("cuda")
pipe.scheduler = FlowMatchEulerX0Scheduler.from_config(pipe.scheduler.config)

prompt = "A cat sleeping on a windowsill with white curtains fluttering in the breeze"

image = pipe(
    prompt,
    height=1024,
    width=1024,
    num_inference_steps=4,
    max_sequence_length=512,
).images[0]

image.save("output.png")
```
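
To make the difference between the two samplers concrete, here is a toy scalar sketch of one update step. It assumes the flow-matching convention `x_sigma = (1 - sigma) * x0 + sigma * eps` with a velocity prediction `v = eps - x0`; the function names `x0_step` and `euler_step` are illustrative and do not come from diffusers. The x0 update first recovers the clean sample and then re-noises it with fresh noise, whereas the Euler update moves deterministically along the predicted velocity.

```python
def x0_step(sample, velocity, sigma, sigma_next, noise):
    """x0-style update: estimate the clean sample, then re-noise to sigma_next."""
    x0 = sample - sigma * velocity
    return (1 - sigma_next) * x0 + sigma_next * noise


def euler_step(sample, velocity, sigma, sigma_next):
    """Deterministic Euler update along the predicted velocity."""
    return sample + (sigma_next - sigma) * velocity


# Toy scalar values (assumed for illustration only).
x0_true, eps, sigma, sigma_next = 2.0, -1.0, 0.75, 0.5
sample = (1 - sigma) * x0_true + sigma * eps  # noisy sample at noise level sigma
velocity = eps - x0_true                      # an ideal velocity prediction

recovered_x0 = sample - sigma * velocity      # equals x0_true for an ideal prediction
euler_next = euler_step(sample, velocity, sigma, sigma_next)       # keeps the original noise eps
x0_next = x0_step(sample, velocity, sigma, sigma_next, noise=0.5)  # uses fresh noise instead
```

With an ideal velocity, the Euler result equals the interpolation at `sigma_next` using the original noise, while the x0 result depends on the freshly drawn noise. When `sigma_next` is 0 the noise term vanishes, so the final x0 step returns the estimated clean sample.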
## DanceGRPO-SenseFlow (supports 4–8-step generation)

Coming soon!
## Citation

```bibtex
@article{ge2025senseflow,
  title={SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation},
  author={Ge, Xingtong and Zhang, Xin and Xu, Tongda and Zhang, Yi and Zhang, Xinjie and Wang, Yan and Zhang, Jun},
  journal={arXiv preprint arXiv:2506.00523},
  year={2025}
}
```