From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 7 days ago • 145
EditCrafter: Tuning-free High-Resolution Image Editing via Pretrained Diffusion Model Paper • 2604.10268 • Published 29 days ago • 12
view article Article Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents 11 days ago • 45
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation Paper • 2604.24764 • Published 13 days ago • 116
view article Article Training and Finetuning Multimodal Embedding & Reranker Models with Sentence Transformers 24 days ago • 69
FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling Paper • 2604.06916 • Published Apr 8 • 34
Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory Paper • 2604.01007 • Published Apr 2 • 31
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published Mar 26 • 132
UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation Paper • 2603.23500 • Published Mar 24 • 36
Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing Paper • 2603.03143 • Published Mar 3 • 145
CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video Paper • 2603.04291 • Published Mar 4 • 14
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation Paper • 2602.24286 • Published Feb 27 • 98
Dr. MAS: Stable Reinforcement Learning for Multi-Agent LLM Systems Paper • 2602.08847 • Published Feb 9 • 29
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning Paper • 2602.08236 • Published Feb 9 • 9