MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU Paper • 2604.05091 • Published Apr 6 • 46
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 18 items • Updated 4 days ago • 294
pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 9 items • Updated 3 days ago • 98
Embeddings datasets ⚡️ Collection This collection gathers datasets for embeddings pre-training and fine-tuning. • 19 items • Updated Apr 7 • 5
SWE-bench Collection SWE-bench is a benchmark for evaluating Language Models and AI Systems on their ability to resolve real-world GitHub Issues. • 4 items • Updated Mar 8, 2025 • 10
MMFineReason Collection High-quality STEM reasoning dataset for Multimodal LLM post-training. • 8 items • Updated 17 days ago • 24
Article The Optimal Architecture for Small Language Models codelion • Dec 26, 2025 • 121
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published Dec 8, 2025 • 80
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 167