vision
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels
Paper
• 2507.21809
• Published
• 140
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion
Paper
• 2507.06165
• Published
• 60
Paper
• 2508.10104
• Published
• 298
Qwen-Image Technical Report
Paper
• 2508.02324
• Published
• 272
Visual-CoG: Stage-Aware Reinforcement Learning with Chain of Guidance for Text-to-Image Generation
Paper
• 2508.18032
• Published
• 41
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Paper
• 2410.10629
• Published
• 12
Masked Autoencoders Are Effective Tokenizers for Diffusion Models
Paper
• 2502.03444
• Published
Seedream 3.0 Technical Report
Paper
• 2504.11346
• Published
• 70
DanceGRPO: Unleashing GRPO on Visual Generation
Paper
• 2505.07818
• Published
• 32
UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward
Paper
• 2509.06818
• Published
• 29
Instruct-Imagen: Image Generation with Multi-modal Instruction
Paper
• 2401.01952
• Published
• 32
Kontinuous Kontext: Continuous Strength Control for Instruction-based Image Editing
Paper
• 2510.08532
• Published
• 6
Diffusion Transformers with Representation Autoencoders
Paper
• 2510.11690
• Published
• 168