Models

1,899

Active filters: nvidia

Sai003/llama-70-b

Text Generation • Updated Nov 7, 2024

Sai003/Llama-3.1-70B

Text Generation • 71B • Updated Nov 8, 2024 • 5

bartowski/Llama-3.1-Nemotron-70B-Instruct-HF-abliterated-GGUF

Text Generation • 71B • Updated Nov 8, 2024 • 5.13k • 6

mradermacher/OpenMath-CodeLlama-70b-Python-hf-GGUF

69B • Updated Aug 1, 2025 • 46

tensorblock/Llama3-ChatQA-1.5-8B-GGUF

Text Generation • 8B • Updated Jan 27 • 13

mradermacher/OpenMath-Llama-2-70b-hf-GGUF

69B • Updated Aug 1, 2025 • 4

mradermacher/OpenMath-CodeLlama-70b-Python-hf-i1-GGUF

69B • Updated Nov 11, 2024 • 70

mradermacher/OpenMath-Llama-2-70b-hf-i1-GGUF

69B • Updated Nov 11, 2024 • 83

elitexp/Llama-3.1-Nemotron-70B-Instruct-HF-Q4-mlx

Text Generation • 11B • Updated Nov 13, 2024 • 6

GainEnergy/OGAI-3.1-Engineer

Text Classification • Updated Nov 14, 2024 • 2

Jellon/Llama-3.1-Nemotron-70B-Instruct-HF-abliterated-exl2-3bpw

Text Generation • Updated Nov 15, 2024 • 3

mradermacher/Nemotron-Mini-4B-Instruct-GGUF

4B • Updated Feb 15, 2025 • 46

mradermacher/Nemotron-Mini-4B-Instruct-i1-GGUF

4B • Updated Feb 15, 2025 • 136

Model-SafeTensors/Llama-3.1-Nemotron-70B-Instruct-HF

Text Generation • 71B • Updated Oct 25, 2024 • 3

Raj-Maharajwala/Open-Insurance-LLM-Llama3-8B

Text Generation • 8B • Updated Nov 24, 2024 • 435 • 4

Raj-Maharajwala/Open-Insurance-LLM-Llama3-8B-GGUF

Text Generation • 8B • Updated Apr 5, 2025 • 55 • 5

bartowski/Open-Insurance-LLM-Llama3-8B-GGUF

Text Generation • 8B • Updated Nov 26, 2024 • 469 • 6

tensorblock/Llama3-ChatQA-2-8B-GGUF

Text Generation • 8B • Updated Jan 27 • 5

prithivMLmods/OpenMath-8B-GGUF

Text Generation • 8B • Updated Nov 30, 2024 • 47 • 1

QuantFactory/Open-Insurance-LLM-Llama3-8B-GGUF

Text Generation • 8B • Updated Dec 2, 2024 • 129 • 6

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_8.0bpw

Text Generation • Updated Dec 2, 2024 • 3

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_7.0bpw

Text Generation • Updated Dec 2, 2024 • 3

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_6.0bpw

Text Generation • Updated Dec 2, 2024 • 2

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_5.0bpw

Text Generation • Updated Dec 2, 2024 • 5

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_4.5bpw

Text Generation • Updated Dec 2, 2024 • 4

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_4.0bpw

Text Generation • Updated Dec 2, 2024 • 5

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_3.5bpw

Text Generation • Updated Dec 2, 2024 • 5

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_3.0bpw

Text Generation • Updated Dec 2, 2024 • 3

tensorblock/Llama-3.1-Nemotron-70B-Instruct-HF-bf16-GGUF

Text Generation • 71B • Updated Jan 27 • 35

ymcki/Llama-3_1-Nemotron-51B-Instruct-GGUF

Text Generation • Updated May 4, 2025 • 179 • 14