Rethinking State Tracking in Recurrent Models Through Error Control Dynamics Paper • 2605.07755 • Published 6 days ago • 21
Article Adding Benchmaxxer Repellant to the Open ASR Leaderboard +9 bezzam, Steveeeeeeen, eustlb, SBruccoleriAppen, jmss-appen, c-e-ford-appen, wgb14, YukaiHuang, like2026, logicbean, ally-lxl • 8 days ago • 15
Investigating Efficiently Extending Transformers for Long Input Summarization Paper • 2208.04347 • Published Aug 8, 2022 • 1
Article EMO: Pretraining mixture of experts for emergent modularity allenai • 6 days ago • 31
Article Multimodal Embedding & Reranker Models with Sentence Transformers tomaarsen • Apr 9 • 59
OlmPool Collection Collection of models from the paper "Cracks in the Foundation: Seemingly Minor Architectural Choices Impact Long Context Extension". • 26 items • Updated 14 days ago • 3
Efficient Training on Multiple Consumer GPUs with RoundPipe Paper • 2604.27085 • Published 15 days ago • 40
Why Fine-Tuning Encourages Hallucinations and How to Fix It Paper • 2604.15574 • Published 28 days ago • 23
Olmo 3.1 Collection The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated Dec 23, 2025 • 51
Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora Paper • 2604.24819 • Published 17 days ago • 88
Laguna XS.2 Collection Designed for agentic coding and long-horizon work on a local machine. Apache 2.0. • 5 items • Updated 7 days ago • 20
Parakeet ASR Collection NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants. • 16 items • Updated 5 days ago • 72
BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation Paper • 2604.09497 • Published Apr 10 • 29
Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems Paper • 2604.14228 • Published about 1 month ago • 25
Cross-Tokenizer LLM Distillation through a Byte-Level Interface Paper • 2604.07466 • Published Apr 13 • 6