How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs Article • nielsr • Apr 7 • 61
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 273
Where Did This Sentence Come From? Tracing Provenance in LLM Reasoning Distillation Paper • 2512.20908 • Published Dec 24, 2025 • 29
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning Paper • 2601.09088 • Published Jan 14 • 63
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training Paper • 2509.23661 • Published Sep 28, 2025 • 49
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published Nov 9, 2025 • 134
Scaling Latent Reasoning via Looped Language Models Paper • 2510.25741 • Published Oct 29, 2025 • 229
SoulX-Podcast: Towards Realistic Long-form Podcasts with Dialectal and Paralinguistic Diversity Paper • 2510.23541 • Published Oct 27, 2025 • 17
Finally, a Replacement for BERT: Introducing ModernBERT Article • bwarner, NohTow, bclavie, orionweller, ohallstrom, staghado, alexisgallagher, rbiswasfc, fladhak, tomaarsen, ncoop57, griffin, jph00, johnowhitaker, iacolippo • Dec 19, 2024 • 740
Cache-to-Cache: Direct Semantic Communication Between Large Language Models Paper • 2510.03215 • Published Oct 3, 2025 • 99
Artificial Hippocampus Networks for Efficient Long-Context Modeling Paper • 2510.07318 • Published Oct 8, 2025 • 32
The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP Paper • 2510.05644 • Published Oct 7, 2025 • 25
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6, 2025 • 192
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14, 2025 • 303
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 629
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6, 2024 • 190