Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models Paper • 2606.11025 • Published 17 days ago • 41
SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning Paper • 2606.10804 • Published 17 days ago • 49
SGMD: Score Gradient Matching Distillation for Few-Step Video Diffusion Distillation Paper • 2605.30116 • Published 29 days ago • 3
Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos Paper • 2605.18233 • Published May 18 • 93
RAVEN: Real-time Autoregressive Video Extrapolation with Consistency-model GRPO Paper • 2605.15190 • Published May 14 • 13
SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer Paper • 2605.15178 • Published May 14 • 91
Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation Paper • 2605.15141 • Published May 14 • 96
LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs Paper • 2605.17260 • Published May 17 • 25
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published May 18 • 115
Mean Mode Screaming: Mean–Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published May 7 • 237
Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs Paper • 2605.09063 • Published May 9 • 82
SenseNova-U1 Collection SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture • 10 items • Updated 13 days ago • 74
HP-Edit: A Human-Preference Post-Training Framework for Image Editing Paper • 2604.19406 • Published Apr 21 • 7
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 244
Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges Paper • 2604.13602 • Published Apr 15 • 32