Image Generation - a zyf515730395 Collection


Image Generation

updated Feb 24

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published Jun 9, 2025 • 40
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Paper • 2506.07986 • Published Jun 9, 2025 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Paper • 2506.06276 • Published Jun 6, 2025 • 26
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5, 2025 • 27
Image Editing As Programs with Diffusion Models

Paper • 2506.04158 • Published Jun 4, 2025 • 24
D-AR: Diffusion via Autoregressive Models

Paper • 2505.23660 • Published May 29, 2025 • 34
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers

Paper • 2505.23758 • Published May 29, 2025 • 22
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

Paper • 2505.18445 • Published May 24, 2025 • 63
DDT: Decoupled Diffusion Transformer

Paper • 2504.05741 • Published Apr 8, 2025 • 77
Step1X-Edit: A Practical Framework for General Image Editing

Paper • 2504.17761 • Published Apr 24, 2025 • 92
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning

Paper • 2504.14509 • Published Apr 20, 2025 • 53
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning

Paper • 2504.07960 • Published Apr 10, 2025 • 50
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation

Paper • 2504.02160 • Published Apr 2, 2025 • 37
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12, 2025 • 77
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity

Paper • 2503.07677 • Published Mar 10, 2025 • 86
Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model

Paper • 2503.07703 • Published Mar 10, 2025 • 37
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

Paper • 2503.16418 • Published Mar 20, 2025 • 36
Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer

Paper • 2504.20690 • Published Apr 29, 2025 • 19
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models

Paper • 2504.17789 • Published Apr 24, 2025 • 23
Seedream 3.0 Technical Report

Paper • 2504.11346 • Published Apr 15, 2025 • 70
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation

Paper • 2504.08736 • Published Apr 11, 2025 • 46
PixelFlow: Pixel-Space Generative Models with Flow

Paper • 2504.07963 • Published Apr 10, 2025 • 18
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL

Paper • 2504.11455 • Published Apr 15, 2025 • 14
D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation

Paper • 2504.09454 • Published Apr 13, 2025 • 11
OminiControl: Minimal and Universal Control for Diffusion Transformer

Paper • 2411.15098 • Published Nov 22, 2024 • 61
Flow Matching for Generative Modeling

Paper • 2210.02747 • Published Oct 6, 2022 • 4
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow

Paper • 2209.03003 • Published Sep 7, 2022 • 3
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71
Align Your Flow: Scaling Continuous-Time Flow Map Distillation

Paper • 2506.14603 • Published Jun 17, 2025 • 19
OmniGen2: Exploration to Advanced Multimodal Generation

Paper • 2506.18871 • Published Jun 23, 2025 • 78
Guidance in the Frequency Domain Enables High-Fidelity Sampling at Low CFG Scales

Paper • 2506.19713 • Published Jun 24, 2025 • 13
XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation

Paper • 2506.21416 • Published Jun 26, 2025 • 28
SingLoRA: Low Rank Adaptation Using a Single Matrix

Paper • 2507.05566 • Published Jul 8, 2025 • 116
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation

Paper • 2507.08441 • Published Jul 11, 2025 • 62
Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4, 2025 • 274
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale

Paper • 2508.10711 • Published Aug 14, 2025 • 146
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation

Paper • 2508.07981 • Published Aug 11, 2025 • 63
Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11, 2025 • 30
Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control

Paper • 2508.08134 • Published Aug 11, 2025 • 10
Next Visual Granularity Generation

Paper • 2508.12811 • Published Aug 18, 2025 • 49
S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models

Paper • 2508.12880 • Published Aug 18, 2025 • 48
MultiRef: Controllable Image Generation with Multiple Visual References

Paper • 2508.06905 • Published Aug 9, 2025 • 21
Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer

Paper • 2508.09131 • Published Aug 12, 2025 • 17
OmniTry: Virtual Try-On Anything without Masks

Paper • 2508.13632 • Published Aug 19, 2025 • 15
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

Paper • 2508.02193 • Published Aug 4, 2025 • 138
Seedream 4.0: Toward Next-generation Multimodal Image Generation

Paper • 2509.20427 • Published Sep 24, 2025 • 84
DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Paper • 2509.16117 • Published Sep 19, 2025 • 23
EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning

Paper • 2509.20360 • Published Sep 24, 2025 • 18
CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching

Paper • 2509.19300 • Published Sep 23, 2025 • 7
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding

Paper • 2510.06308 • Published Oct 7, 2025 • 55
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer

Paper • 2510.06590 • Published Oct 8, 2025 • 78
Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published Oct 13, 2025 • 170
Latent Diffusion Model without Variational Autoencoder

Paper • 2510.15301 • Published Oct 17, 2025 • 50
WithAnyone: Towards Controllable and ID Consistent Image Generation

Paper • 2510.14975 • Published Oct 16, 2025 • 86
Learning an Image Editing Model without Image Editing Pairs

Paper • 2510.14978 • Published Oct 16, 2025 • 9
The Principles of Diffusion Models

Paper • 2510.21890 • Published Oct 24, 2025 • 64
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Paper • 2510.08673 • Published Oct 9, 2025 • 127
From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model

Paper • 2510.19871 • Published Oct 22, 2025 • 30
From Editor to Dense Geometry Estimator

Paper • 2509.04338 • Published Sep 4, 2025 • 96
AToken: A Unified Tokenizer for Vision

Paper • 2509.14476 • Published Sep 17, 2025 • 37
DoPE: Denoising Rotary Position Embedding

Paper • 2511.09146 • Published Nov 12, 2025 • 98
NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation

Paper • 2601.02204 • Published Jan 5 • 63
ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing

Paper • 2601.03467 • Published Jan 6 • 7
SpotEdit: Selective Region Editing in Diffusion Transformers

Paper • 2512.22323 • Published Dec 26, 2025 • 39
DreamOmni3: Scribble-based Editing and Generation

Paper • 2512.22525 • Published Dec 27, 2025 • 15
LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published Jan 23 • 180
FireRed-Image-Edit-1.0 Technical Report

Paper • 2602.13344 • Published Feb 12 • 8
DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing

Paper • 2602.12205 • Published Feb 12 • 83
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing

Paper • 2602.02437 • Published Feb 2 • 80
