GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents • Paper • 2604.07429 • Published 7 days ago • 14
ViVa: A Video-Generative Value Model for Robot Reinforcement Learning • Paper • 2604.08168 • Published 6 days ago • 17
Small Vision-Language Models are Smart Compressors for Long Video Understanding • Paper • 2604.08120 • Published 6 days ago • 18
FIT: A Large-Scale Dataset for Fit-Aware Virtual Try-On • Paper • 2604.08526 • Published 6 days ago • 20
MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping • Paper • 2604.08364 • Published 6 days ago • 94