DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing Paper • 2602.12205 • Published 11 days ago • 78
MOVA: Towards Scalable and Synchronized Video-Audio Generation Paper • 2602.08794 • Published 14 days ago • 152
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing Paper • 2602.02437 • Published 21 days ago • 76
G^2RPO: Granular GRPO for Precise Reward in Flow Models Paper • 2510.01982 • Published Oct 2, 2025 • 7
AdaTooler-V: Adaptive Tool-Use for Images and Videos Paper • 2512.16918 • Published Dec 18, 2025 • 14
StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors Paper • 2512.16915 • Published Dec 18, 2025 • 38
Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning Paper • 2512.15693 • Published Dec 17, 2025 • 18
V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties Paper • 2512.11799 • Published Dec 12, 2025 • 30
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought Paper • 2511.02779 • Published Nov 4, 2025 • 59
UniREditBench: A Unified Reasoning-based Image Editing Benchmark Paper • 2511.01295 • Published Nov 3, 2025 • 39
STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence Paper • 2510.24693 • Published Oct 28, 2025 • 19
SPARK: Synergistic Policy And Reward Co-Evolving Framework Paper • 2509.22624 • Published Sep 26, 2025 • 19
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning Paper • 2509.22647 • Published Sep 26, 2025 • 33
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning Paper • 2508.20096 • Published Aug 27, 2025 • 37
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 269
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience Paper • 2508.04700 • Published Aug 6, 2025 • 52
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4, 2025 • 136
Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models Paper • 2508.00819 • Published Aug 1, 2025 • 63
SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction Paper • 2507.15852 • Published Jul 21, 2025 • 38