Training Language Models via Neural Cellular Automata Paper • 2603.10055 • Published 4 days ago • 3
Coarse-Guided Visual Generation via Weighted h-Transform Sampling Paper • 2603.12057 • Published 1 day ago • 4
Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining Paper • 2603.11103 • Published 2 days ago • 5
Video-Based Reward Modeling for Computer-Use Agents Paper • 2603.10178 • Published 3 days ago • 28
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 1 day ago • 38
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training Paper • 2603.12255 • Published about 24 hours ago • 62
According to Me: Long-Term Personalized Referential Memory QA Paper • 2603.01990 • Published 11 days ago • 3
MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents Paper • 2603.09827 • Published 3 days ago • 26
LLM2Vec-Gen: Generative Embeddings from Large Language Models Paper • 2603.10913 • Published 2 days ago • 27
Flash-KMeans: Fast and Memory-Efficient Exact K-Means Paper • 2603.09229 • Published 4 days ago • 59
FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling Paper • 2603.06199 • Published 7 days ago • 9
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published 7 days ago • 102
STMI: Segmentation-Guided Token Modulation with Cross-Modal Hypergraph Interaction for Multi-Modal Object Re-Identification Paper • 2603.00695 • Published 13 days ago • 3
KARL: Knowledge Agents via Reinforcement Learning Paper • 2603.05218 • Published 8 days ago • 6
DreamWorld: Unified World Modeling in Video Generation Paper • 2603.00466 • Published 14 days ago • 16
MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier Paper • 2603.03756 • Published 9 days ago • 86
Helios: Real Real-Time Long Video Generation Model Paper • 2603.04379 • Published 9 days ago • 161
ArtHOI: Articulated Human-Object Interaction Synthesis by 4D Reconstruction from Video Priors Paper • 2603.04338 • Published 9 days ago • 21