VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4, 2025 • 104
SEGA: Spectral-Energy Guided Attention for Resolution Extrapolation in Diffusion Transformers Paper • 2605.22668 • Published May 21 • 40
NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation Paper • 2601.02204 • Published Jan 5 • 64
Draw-In-Mind: Learning Precise Image Editing via Chain-of-Thought Imagination Paper • 2509.01986 • Published Sep 2, 2025 • 5
Article Fast LoRA inference for Flux with Diffusers and PEFT sayakpaul, BenjaminB • Jul 23, 2025 • 54
Article (LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware derekl35, marcsun13, sayakpaul, merve, linoyts • Jun 19, 2025 • 107
Alchemist: Turning Public Text-to-Image Data into Generative Gold Paper • 2505.19297 • Published May 25, 2025 • 85
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT Paper • 2505.00703 • Published May 1, 2025 • 44