NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 12 items • Updated about 23 hours ago • 181
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8 Text Generation • 124B • Updated about 23 hours ago • 2.34k • 73
Bringing paper to life: A modern template for scientific writing Space • Explore and download a modern scientific paper template • 55
Lost in Stories: Consistency Bugs in Long Story Generation by LLMs Paper • 2603.05890 • Published 6 days ago • 80
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens Space • Explore synthetic data experiments in a bookshelf view • 154
pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated 14 days ago • 87
On Data Engineering for Scaling LLM Terminal Capabilities Paper • 2602.21193 • Published 16 days ago • 94
Query-focused and Memory-aware Reranker for Long Context Processing Paper • 2602.12192 • Published 28 days ago • 56
SkillOrchestra: Learning to Route Agents via Skill Transfer Paper • 2602.19672 • Published 17 days ago • 55
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 261
Creative Writing Datasets Collection High-quality creative writing and storytelling data. • 36 items • Updated about 9 hours ago • 6
Instruction & Reasoning Collection Datasets for instruction following, code, and reasoning. • 13 items • Updated 14 days ago • 7
SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning Paper • 2602.13515 • Published 27 days ago • 43