Models

1,922

Active filters: nvidia

Raj-Maharajwala/Open-Insurance-LLM-Llama3-8B

Text Generation • 8B • Updated Nov 24, 2024 • 460 • 4

Raj-Maharajwala/Open-Insurance-LLM-Llama3-8B-GGUF

Text Generation • 8B • Updated Apr 5, 2025 • 59 • 5

bartowski/Open-Insurance-LLM-Llama3-8B-GGUF

Text Generation • 8B • Updated Nov 26, 2024 • 506 • 6

tensorblock/Llama3-ChatQA-2-8B-GGUF

Text Generation • 8B • Updated Jan 27 • 6

prithivMLmods/OpenMath-8B-GGUF

Text Generation • 8B • Updated Nov 30, 2024 • 27 • 1

QuantFactory/Open-Insurance-LLM-Llama3-8B-GGUF

Text Generation • 8B • Updated Dec 2, 2024 • 136 • 6

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_8.0bpw

Text Generation • Updated Dec 2, 2024 • 4

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_7.0bpw

Text Generation • Updated Dec 2, 2024 • 6

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_6.0bpw

Text Generation • Updated Dec 2, 2024 • 3

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_5.0bpw

Text Generation • Updated Dec 2, 2024 • 6

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_4.5bpw

Text Generation • Updated Dec 2, 2024 • 2

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_4.0bpw

Text Generation • Updated Dec 2, 2024 • 6

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_3.5bpw

Text Generation • Updated Dec 2, 2024 • 6

Dracones/Llama-3.1-Nemotron-70B-Instruct_exl2_3.0bpw

Text Generation • Updated Dec 2, 2024 • 4

tensorblock/Llama-3.1-Nemotron-70B-Instruct-HF-bf16-GGUF

Text Generation • 71B • Updated Jan 27 • 28

ymcki/Llama-3_1-Nemotron-51B-Instruct-GGUF

Text Generation • Updated May 4, 2025 • 217 • 14

tensorblock/Llama-3.1-Nemotron-70B-Instruct-GGUF

Text Generation • 71B • Updated Jan 27 • 8

KnutJaegersberg/Llama3-ChatQA-2-70B-4.65bpw-exl2

Text Generation • Updated Dec 8, 2024 • 1

tensorblock/Llama3-ChatQA-1.5-70B-GGUF

Text Generation • 71B • Updated Jan 27 • 10

sandbox-ai/Llama-3.1-Tango-70b-bnb_4b

Text Generation • 73B • Updated Jan 3, 2025 • 3 • 4

lfb0424/model-test

Text Generation • Updated Dec 17, 2024

jeorjesami/NividiaLatestModel

Text Generation • Updated Dec 12, 2024 • 1

jacobcarajo/OpenMath2-Llama3.1-8B-Q5_K_M-GGUF

8B • Updated Dec 14, 2024 • 4

cnfusion/Llama-3.1-Nemotron-70B-Instruct-HF-Q2-mlx

Text Generation • 7B • Updated Dec 16, 2024 • 13

nvidia/NVLM-D-72B-mcore

Image-Text-to-Text • Updated Jan 14, 2025 • 7

nvidia/Llama-2-7B-DMC-4x

Updated Dec 22, 2024 • 3

nvidia/Llama-2-7B-DMC-8x

Updated Dec 22, 2024 • 3

nvidia/Llama-2-13B-DMC-4x

Updated Dec 22, 2024 • 2

nvidia/Llama-2-13B-DMC-8x

Updated Dec 22, 2024 • 3

backyardai/Llama-3.1-Nemotron-70B-Instruct-GGUF

Text Generation • 71B • Updated Dec 22, 2024 • 108