Imagination Helps Visual Reasoning, But Not Yet in Latent Space Paper • 2602.22766 • Published 16 days ago • 41
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing Paper • 2602.03560 • Published Feb 3 • 46
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 29 days ago • 54
Next Embedding Prediction Makes World Models Stronger Paper • 2603.02765 • Published 11 days ago • 19
NLE: Non-autoregressive LLM-based ASR by Transcript Editing Paper • 2603.08397 • Published 4 days ago • 17
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 Text Generation • 124B • Updated about 2 hours ago • 6.44k • 167
Weak-SIGReg: Covariance Regularization for Stable Deep Learning Paper • 2603.05924 • Published 8 days ago • 1