Even with AI, Bijection Discovery is Still Hard: The Opportunities and Challenges of OpenEvolve for Novel Bijection Construction Paper • 2511.20987 • Published Nov 26, 2025 • 1
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms Paper • 2511.17592 • Published Nov 17, 2025 • 122
KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks Paper • 2606.03458 • Published 21 days ago • 65
Efficient Training on Multiple Consumer GPUs with RoundPipe Paper • 2604.27085 • Published Apr 29 • 47
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano, Super, and Ultra v3. • 50 items • Updated 11 days ago • 161
Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 909
Nano Language Models Collection A collection of really small language models pre-trained from scratch with open-data. Ideal for use in experimentation and evaluations. • 3 items • Updated Mar 25 • 1
Article Scaling Pedagogical Pre-training: From Optimal Mixing to 10 Billion Tokens codelion • Mar 6 • 5
🤏 Smol-Data Collection Tried and tested mixes for strong pretraining. Inspired by https://huggingface.co/blog/codelion/optimal-dataset-mixing • 14 items • Updated Mar 2 • 12
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published Jan 30 • 229
Article Reverse Engineering a $500M Mystery: From HashHop to Memory-Augmented Language Models codelion • Jan 23 • 10
Article The Optimal Architecture for Small Language Models codelion • Dec 26, 2025 • 121
Article Ellora: Enhancing LLMs with LoRA - Standardized Recipes for Capability Enhancement codelion • Dec 3, 2025 • 15
Budget-Aware Tool-Use Enables Effective Agent Scaling Paper • 2511.17006 • Published Nov 21, 2025 • 34
Article The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix codelion • Nov 3, 2025 • 65