Imagination Helps Visual Reasoning, But Not Yet in Latent Space Paper • 2602.22766 • Published 15 days ago • 40
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing Paper • 2602.03560 • Published Feb 3 • 46
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 28 days ago • 54
Next Embedding Prediction Makes World Models Stronger Paper • 2603.02765 • Published 10 days ago • 19
NLE: Non-autoregressive LLM-based ASR by Transcript Editing Paper • 2603.08397 • Published 3 days ago • 16
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 Text Generation • 124B • Updated 1 day ago • 2.85k • 131
Weak-SIGReg: Covariance Regularization for Stable Deep Learning Paper • 2603.05924 • Published 7 days ago • 1