ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation
Zihan Yang¹, Shuyuan Tu¹, Licheng Zhang¹, Qi Dai², Yu-Gang Jiang¹, Zuxuan Wu¹
¹Fudan University; ²Microsoft Research Asia
First install the official code repository, which provides the `lakonlab` package imported in the examples.
We provide diffusers pipelines for easy inference. The following code demonstrates how to sample images from the distilled Qwen-Image and FLUX.1-dev models.
# Qwen-Image with the ArcFlow 2-step adapter
import torch
from diffusers import FlowMatchEulerDiscreteScheduler
from lakonlab.pipelines.arcqwen_pipeline import ArcQwenImagePipeline

pipe = ArcQwenImagePipeline.from_pretrained(
    'Qwen/Qwen-Image',
    torch_dtype=torch.bfloat16)
adapter_name = pipe.load_arcflow_adapter(
    'ymyy307/ArcFlow',
    subfolder='arcflow-qwen-2steps',
    target_module_name='transformer')
# use fixed shift=3.2
pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config(
    pipe.scheduler.config, shift=3.2, shift_terminal=None, use_dynamic_shifting=False)
pipe = pipe.to('cuda')

nfe = 4  # passed to the pipeline as num_inference_steps
# nfe = 2
out = pipe(
    prompt='A semi-realistic fantasy illustration featuring a split composition of two young men in profile, facing away from each other. On the left, a pale man with sharp features and slicked-back black hair wears a dark coat. On the right, a tan man with messy wavy hair wears a blue tunic. The ornate, 3D metallic gold title "Sultan\'s Game" overlays the bottom center. The background is divided into distinct sections: vibrant red abstract shapes in the upper half and deep teal textures in the lower half, creating a sharp color contrast. Painterly brushstrokes.',
    num_images_per_prompt=1,
    width=1024,
    height=1024,
    num_inference_steps=nfe,
    generator=torch.Generator(device="cuda").manual_seed(42),
    timestep_ratio=1.0,
).images[0]
out.save(f'arcqwen_{nfe}nfe.png')

# FLUX.1-dev with the ArcFlow 2-step adapter
import torch
from diffusers import FlowMatchEulerDiscreteScheduler
from lakonlab.pipelines.arcflux_pipeline import ArcFluxPipeline

pipe = ArcFluxPipeline.from_pretrained(
    'black-forest-labs/FLUX.1-dev',
    torch_dtype=torch.bfloat16)
# you may later call `pipe.set_adapters([adapter_name, ...])` to combine other adapters (e.g., style LoRAs)
adapter_name = pipe.load_arcflow_adapter(
    'ymyy307/ArcFlow',
    subfolder='arcflow-flux-2steps',
    target_module_name='transformer')
# use fixed shift=3.2
pipe.scheduler = FlowMatchEulerDiscreteScheduler.from_config(
    pipe.scheduler.config, shift=3.2, shift_terminal=None, use_dynamic_shifting=False)
pipe = pipe.to('cuda')

nfe = 4  # passed to the pipeline as num_inference_steps
# nfe = 2
out = pipe(
    prompt='A portrait photo of a kangaroo wearing an orange hoodie and blue sunglasses standing in front of the Sydney Opera House holding a sign on the chest that says "Welcome Friends"',
    num_images_per_prompt=1,
    width=1024,
    height=1024,
    num_inference_steps=nfe,
    generator=torch.Generator(device="cuda").manual_seed(42),
    timestep_ratio=1.0,
).images[0]
out.save(f'arcflux_{nfe}nfe.png')

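Both examples replace the default scheduler with one that uses a fixed shift of 3.2 instead of dynamic shifting. As a rough illustration of what that setting does, the sketch below applies the static shift formula used by `FlowMatchEulerDiscreteScheduler`, `sigma' = shift * sigma / (1 + (shift - 1) * sigma)`, to a small noise-level grid. The helper names `shift_sigma` and `shifted_schedule` are hypothetical, and the evenly spaced unshifted grid `1, 1 - 1/nfe, ..., 1/nfe` is an assumption borrowed from the stock diffusers FLUX pipeline; the ArcFlow pipelines may construct their grid differently (for example via `timestep_ratio`).

```python
def shift_sigma(sigma: float, shift: float = 3.2) -> float:
    """Static time shift: sigma' = shift * sigma / (1 + (shift - 1) * sigma)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(nfe: int, shift: float = 3.2) -> list[float]:
    """Shifted noise levels for `nfe` steps (hypothetical helper).

    Assumes an evenly spaced unshifted grid 1, 1 - 1/nfe, ..., 1/nfe.
    """
    base = [1.0 - i / nfe for i in range(nfe)]
    return [shift_sigma(s, shift) for s in base]


if __name__ == "__main__":
    for nfe in (2, 4):
        # Print the shifted noise levels visited for each step count.
        print(nfe, [round(s, 4) for s in shifted_schedule(nfe)])
```

With a shift above 1, every intermediate noise level moves toward 1, so under these assumptions a 2-step run would take its second step from a noise level of about 0.76 rather than 0.5.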
@misc{yang2026arcflowunleashing2steptexttoimage,
      title={ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation},
      author={Zihan Yang and Shuyuan Tu and Licheng Zhang and Qi Dai and Yu-Gang Jiang and Zuxuan Wu},
      year={2026},
      eprint={2602.09014},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2602.09014},
}
Base model: Qwen/Qwen-Image