NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 7 items • Updated 6 days ago • 131
The End of Manual Decoding: Towards Truly End-to-End Language Models Paper • 2510.26697 • Published Oct 30, 2025 • 117
view article Article Vision Tokens vs Text Tokens: Understanding the 10× Compression Oct 22, 2025 • 6
view article Article All LLMs Write Great Code, But Some Make (A Lot) Fewer Mistakes Sep 12, 2024 • 5
SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs Paper • 2510.05069 • Published Oct 6, 2025 • 13
Symbolic Graphics Programming with Large Language Models Paper • 2509.05208 • Published Sep 5, 2025 • 47
view article Article Supercharge Edge AI With High‑Accuracy Reasoning Using NVIDIA Nemotron Nano 2 9B Aug 18, 2025 • 31
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation Paper • 2412.11919 • Published Dec 16, 2024 • 36
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models Paper • 2410.10139 • Published Oct 14, 2024 • 51
view article Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy +4 Sep 18, 2024 • 274
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Dec 31, 2025 • 354
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3, 2024 • 83
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference Paper • 2407.14057 • Published Jul 19, 2024 • 46