NormGuard: Reward-Preserving Norm Constraints in Flow-Matching Reinforcement Learning Paper • 2606.27771 • Published 4 days ago • 3
Article NEO-unify: Building Native Multimodal Unified Models End to End sensenova • Mar 5 • 167
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13, 2025 • 225
InterleaveThinker: Reinforcing Agentic Interleaved Generation Paper • 2606.13679 • Published 19 days ago • 82
From Pixels to Words -- Towards Native One-Vision Models at Scale Paper • 2605.28820 • Published May 27 • 75
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published May 12 • 194
SenseNova-U1 Collection SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture • 10 items • Updated 17 days ago • 74
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation Paper • 2604.24763 • Published Apr 27 • 71
WorldMark: A Unified Benchmark Suite for Interactive Video World Models Paper • 2604.21686 • Published Apr 23 • 36 • 3
Prompt Relay: Inference-Time Temporal Control for Multi-Event Video Generation Paper • 2604.10030 • Published Apr 11 • 15
MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding Paper • 2603.22458 • Published Mar 23 • 138
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning Paper • 2603.21065 • Published Mar 22 • 78