MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction Paper • 2606.18558 • Published 8 days ago • 50
Revisiting Articulated Parts Perception in Robot Manipulation Paper • 2606.08103 • Published 19 days ago • 3
VideoMDM: Towards 3D Human Motion Generation From 2D Supervision Paper • 2606.13364 • Published 14 days ago • 20
Article How to Fine-Tune Nemotron 3.5 ASR for Your Language, Domain, or Accent nvidia • 21 days ago • 64
AudioMosaic Collection ICML2026 AudioMosaic: Contrastive Masked Audio Representation Learning • 15 items • Updated May 10 • 3
MOSS-Audio Collection An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 9 items • Updated 13 days ago • 66
gliner2 family Collection GLiNER2 extends the original GLiNER architecture to support multi-task information extraction with a schema-driven interface. • 7 items • Updated May 16 • 53
CubePart: An Open-Vocabulary Part-Controllable 3D Generator Paper • 2605.28763 • Published 29 days ago • 14
GEM: Generative Supervision Helps Embodied Intelligence Paper • 2605.28548 • Published 29 days ago • 41
ControlLight: Towards Controllable, Consistent, and Generalizable Low-Light Enhancement Paper • 2605.25569 • Published May 25 • 21
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published May 20 • 111
TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction Paper • 2605.26115 • Published May 25 • 52