Article How I contributed a new model to the Transformers library using Codex nielsr • Mar 30 • 51
Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck Paper • 2603.08462 • Published Mar 9 • 22
Article We Got Claude to Build CUDA Kernels and teach open models! +2 burtenshaw, evalstate, merve, pcuenq • Jan 28 • 156
Article Case Study: The Marcus-Thorne Mystery Cache Standoff unmodeled-tyler • Jan 1 • 3
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 18 items • Updated 9 days ago • 294
The End of Manual Decoding: Towards Truly End-to-End Language Models Paper • 2510.26697 • Published Oct 30, 2025 • 120
Article Vision Tokens vs Text Tokens: Understanding the 10× Compression onekq • Oct 22, 2025 • 6
Article All LLMs Write Great Code, But Some Make (A Lot) Fewer Mistakes onekq • Sep 12, 2024 • 5
SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs Paper • 2510.05069 • Published Oct 6, 2025 • 13
Symbolic Graphics Programming with Large Language Models Paper • 2509.05208 • Published Sep 5, 2025 • 47
Article Supercharge Edge AI With High‑Accuracy Reasoning Using NVIDIA Nemotron Nano 2 9B nvidia • Aug 18, 2025 • 32
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation Paper • 2412.11919 • Published Dec 16, 2024 • 36
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models Paper • 2410.10139 • Published Oct 14, 2024 • 51
Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy +4 medmekk, marcsun13, lvwerra, pcuenq, osanseviero, thomwolf • Sep 18, 2024 • 280
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 38 items • Updated Mar 2 • 368
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3, 2024 • 83