NormGuard: Reward-Preserving Norm Constraints in Flow-Matching Reinforcement Learning Paper • 2606.27771 • Published 4 days ago • 3
view article Article NEO-unify: Building Native Multimodal Unified Models End to End sensenova • Mar 5 • 167
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13, 2025 • 225
InterleaveThinker: Reinforcing Agentic Interleaved Generation Paper • 2606.13679 • Published 19 days ago • 82
From Pixels to Words -- Towards Native One-Vision Models at Scale Paper • 2605.28820 • Published May 27 • 75
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published May 12 • 194
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation Paper • 2604.24763 • Published Apr 27 • 71
Prompt Relay: Inference-Time Temporal Control for Multi-Event Video Generation Paper • 2604.10030 • Published Apr 11 • 15
MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding Paper • 2603.22458 • Published Mar 23 • 138
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning Paper • 2603.21065 • Published Mar 22 • 78
SafeRBench: A Comprehensive Benchmark for Safety Assessment in Large Reasoning Models Paper • 2511.15169 • Published Nov 19, 2025 • 1
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published Dec 15, 2025 • 76
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27, 2025 • 99