MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data Paper • 2603.25319 • Published 3 days ago • 26
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 3 days ago • 105
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 4 days ago • 86
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience Paper • 2603.24533 • Published 4 days ago • 39
VGA [* Polaris Series] Collection capable of accurately locating and understanding 'any' object | State: Experimental, Category: Object Detection, High-depth analysis in visual tasks • 7 items • Updated 3 days ago • 1
Article SynthVision: Building a 110K Synthetic Medical VQA Dataset with Cross-Model Validation 6 days ago • 12
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published 10 days ago • 61
SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing Paper • 2603.19228 • Published 10 days ago • 66
Article ATE-2: State-of-the-Art Armenian Text Embeddings and the ArmBench-TextEmbed Benchmark 10 days ago • 8
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence Paper • 2603.13398 • Published 18 days ago • 150
Article The First Healthcare Robotics Dataset and Foundational Physical AI Models for Healthcare Robotics 13 days ago • 22
Grounding World Simulation Models in a Real-World Metropolis Paper • 2603.15583 • Published 13 days ago • 148
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data Paper • 2603.15594 • Published 13 days ago • 145