Generation Enhances Understanding in Unified Multimodal Models via Multi-Representation Generation Paper • 2601.21406 • Published 1 day ago • 3
Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models Paper • 2601.20354 • Published 2 days ago • 96
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published 1 day ago • 42
Less is More: Optimizing Function Calling for LLM Execution on Edge Devices Paper • 2411.15399 • Published Nov 23, 2024 • 1
Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective Article • 4 days ago • 37
iFSQ: Improving FSQ for Image Generation with 1 Line of Code Paper • 2601.17124 • Published 7 days ago • 30