Working Notes on Late Interaction Dynamics: Analyzing Targeted Behaviors of Late Interaction Models Paper • 2603.26259 • Published Mar 27 • 8
Article DeepSeek-V4: a million-token context that agents can actually use burtenshaw • 20 days ago • 44
Article Mixture of Experts (MoEs) in Transformers ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 159
Article LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling lightonai • Feb 12 • 56
Article Transformers v5: Simple model definitions powering the AI ecosystem lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 310
Article Continuous batching from first principles ror, ArthurZ, mcpotato • Nov 25, 2025 • 379
Surfer 2: The Next Generation of Cross-Platform Computer Use Agents Paper • 2510.19949 • Published Oct 22, 2025 • 38
Holo1.5 Collection Holo1.5 - Open Foundation Models for Computer Use Agents • 5 items • Updated Sep 15, 2025 • 35
Article You could have designed state of the art positional encoding FL33TW00D-HF • Nov 25, 2024 • 478
Article SmolLM3: smol, multilingual, long-context reasoner eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 773
Holo1 Collection Vision-Language Action Model for use in Surfer-H web navigation agent • 6 items • Updated Jun 10, 2025 • 49
Article nanoVLM: The simplest repository to train your VLM in pure PyTorch ariG23498, lusxvr, andito, sergiopaniego, merve, pcuenq, reach-vb • May 21, 2025 • 258
Article Preference Optimization for Vision Language Models qgallouedec, vwxyzjn, merve, kashif • Jul 10, 2024 • 93
Article Vision Language Models (Better, faster, stronger) merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 611
Article Gotchas in Tokenizer Behavior Every Developer Should Know qgallouedec • Apr 18, 2025 • 72
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14, 2025 • 309
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 207
Article ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval manu • Mar 18, 2025 • 16
Article Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth mlabonne • Jul 29, 2024 • 371