Optimized ONNX models for NVIDIA RTX GPUs Collection Collection of optimized ONNX model checkpoints for NVIDIA RTX GPUs • 7 items • Updated 6 days ago • 14
NVIDIA Nemotron V2 Collection Open, Production-ready Enterprise Models. Nvidia Open Model license. • 9 items • Updated 6 days ago • 104
Nemotron RAG Collection Set of tools to build retrieval-augmented generation (RAG) systems, improve search and ranking accuracy, and extract structured data from complex docs • 10 items • Updated 6 days ago • 92
view article Article Small Yet Mighty: Improve Accuracy In Multimodal Search and Visual Document Retrieval with Llama Nemotron RAG Models nvidia • Jan 6 • 28
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory • 15 items • Updated Mar 12 • 218
Gemma 3 Collection All versions of Google's new multimodal models including QAT in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 54 items • Updated 23 days ago • 115
Llama 4 Collection Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 23 days ago • 56