NEO-unify: Building Native Multimodal Unified Models End to End Article • sensenova • Mar 5 • 167
How and What to Imagine? Visual Thinking in Unified Multimodal Models for Cross-View Spatial Reasoning Paper • 2605.27310 • Published May 26 • 20
RiT: Vanilla Diffusion Transformers Suffice in Representation Space Paper • 2605.21981 • Published May 21 • 10
Communicating about Space: Language-Mediated Spatial Integration Across Partial Views Paper • 2603.27183 • Published Mar 28 • 20
Grounding World Simulation Models in a Real-World Metropolis Paper • 2603.15583 • Published Mar 16 • 155
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published Mar 13 • 149
LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs Paper • 2602.00462 • Published Jan 31 • 21