OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation Paper • 2506.07977 • Published Jun 9, 2025 • 41
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers Paper • 2506.07986 • Published Jun 9, 2025 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis Paper • 2506.06276 • Published Jun 6, 2025 • 26
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers Paper • 2505.23758 • Published May 29, 2025 • 22
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data Paper • 2505.18445 • Published May 24, 2025 • 63
Step1X-Edit: A Practical Framework for General Image Editing Paper • 2504.17761 • Published Apr 24, 2025 • 92
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning Paper • 2504.14509 • Published Apr 20, 2025 • 53
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning Paper • 2504.07960 • Published Apr 10, 2025 • 50
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation Paper • 2504.02160 • Published Apr 2, 2025 • 37
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12, 2025 • 75
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity Paper • 2503.07677 • Published Mar 10, 2025 • 86
Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model Paper • 2503.07703 • Published Mar 10, 2025 • 37
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity Paper • 2503.16418 • Published Mar 20, 2025 • 36
Flow-GRPO: Training Flow Matching Models via Online RL Paper • 2505.05470 • Published May 8, 2025 • 87
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer Paper • 2504.20690 • Published Apr 29, 2025 • 19
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models Paper • 2504.17789 • Published Apr 24, 2025 • 23
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation Paper • 2504.08736 • Published Apr 11, 2025 • 46
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL Paper • 2504.11455 • Published Apr 15, 2025 • 14
D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation Paper • 2504.09454 • Published Apr 13, 2025 • 11
OminiControl: Minimal and Universal Control for Diffusion Transformer Paper • 2411.15098 • Published Nov 22, 2024 • 61
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow Paper • 2209.03003 • Published Sep 7, 2022 • 2
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis Paper • 2403.03206 • Published Mar 5, 2024 • 71
Align Your Flow: Scaling Continuous-Time Flow Map Distillation Paper • 2506.14603 • Published Jun 17, 2025 • 19
OmniGen2: Exploration to Advanced Multimodal Generation Paper • 2506.18871 • Published Jun 23, 2025 • 78
Guidance in the Frequency Domain Enables High-Fidelity Sampling at Low CFG Scales Paper • 2506.19713 • Published Jun 24, 2025 • 14
XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation Paper • 2506.21416 • Published Jun 26, 2025 • 28
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation Paper • 2507.08441 • Published Jul 11, 2025 • 62
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14, 2025 • 145
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation Paper • 2508.07981 • Published Aug 11, 2025 • 63
Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control Paper • 2508.08134 • Published Aug 11, 2025 • 10
S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models Paper • 2508.12880 • Published Aug 18, 2025 • 48
MultiRef: Controllable Image Generation with Multiple Visual References Paper • 2508.06905 • Published Aug 9, 2025 • 21
Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer Paper • 2508.09131 • Published Aug 12, 2025 • 16
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4, 2025 • 136
Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24, 2025 • 82
DiffusionNFT: Online Diffusion Reinforcement with Forward Process Paper • 2509.16117 • Published Sep 19, 2025 • 22
EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning Paper • 2509.20360 • Published Sep 24, 2025 • 18
CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching Paper • 2509.19300 • Published Sep 23, 2025 • 7
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published Oct 7, 2025 • 55
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer Paper • 2510.06590 • Published Oct 8, 2025 • 76
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13, 2025 • 166
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published Oct 17, 2025 • 49
WithAnyone: Towards Controllable and ID Consistent Image Generation Paper • 2510.14975 • Published Oct 16, 2025 • 85
Learning an Image Editing Model without Image Editing Pairs Paper • 2510.14978 • Published Oct 16, 2025 • 9
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published Oct 9, 2025 • 126
From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model Paper • 2510.19871 • Published Oct 22, 2025 • 30
NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation Paper • 2601.02204 • Published Jan 5, 2026 • 62
ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing Paper • 2601.03467 • Published Jan 6, 2026 • 7
SpotEdit: Selective Region Editing in Diffusion Transformers Paper • 2512.22323 • Published Dec 26, 2025 • 39