Models: 128

Active filters: tool use

NousResearch/Hermes-4.3-36B

Text Generation • 36B • Updated Dec 6, 2025 • 7.33k • 226

NousResearch/Hermes-4-14B

Text Generation • 425k • Updated Jan 9 • 34.9k • 151

NousResearch/Hermes-4-70B

Text Generation • 71B • Updated Sep 2, 2025 • 1.08k • 193

NousResearch/Hermes-4-70B-FP8

Text Generation • 71B • Updated Sep 12, 2025 • 3.35k • 33

NousResearch/Hermes-4-14B-FP8

Text Generation • 15B • Updated Sep 3, 2025 • 11.2k • 25

NousResearch/Hermes-4-405B

Text Generation • 406B • Updated Sep 2, 2025 • 450 • 91

NousResearch/Hermes-4-405B-FP8

Text Generation • 406B • Updated Sep 2, 2025 • 410 • 31

NousResearch/Hermes-4.3-36B-GGUF

Text Generation • 36B • Updated Dec 3, 2025 • 4.95k • 54

mlx-community/Hermes-4-14B-8bit

Text Generation • 15B • Updated Sep 3, 2025 • 260 • 2

alexcovo/Hermes-4.3-36B-mlx-8Bit

Text Generation • 36B • Updated Dec 11, 2025 • 217 • 1

hrktos-37/Hermes-4-70B-heretic

Text Generation • 71B • Updated Jan 2 • 8 • 2

unclecode/llama3-function-call-lora-adapter-240424

Updated Apr 25, 2024 • 4

unclecode/llama3-function-call-Q4_K_M_GGFU-240424

8B • Updated Apr 25, 2024 • 99 • 3

unclecode/tinyllama-function-call-lora-adapter-250424

Updated Apr 25, 2024

unclecode/tinyllama-function-call-Q4_K_M_GGFU-250424

1B • Updated May 1, 2024 • 347 • 4

mims-harvard/TxAgent-T1-Llama-3.1-8B

Text Generation • 8B • Updated Apr 10, 2025 • 325 • 31

mradermacher/TxAgent-T1-Llama-3.1-8B-GGUF

8B • Updated Jul 11, 2025 • 263 • 2

DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-GGUF

Text Generation • 25B • Updated Jul 28, 2025 • 443 • 10

DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-COGITO-Deep-Reasoning-32B-GGUF

Text Generation • 25B • Updated May 28, 2025 • 380 • 6

tensorblock/mims-harvard_TxAgent-T1-Llama-3.1-8B-GGUF

Text Generation • 8B • Updated Jan 27 • 49

DavidAU/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B

Text Generation • 25B • Updated May 28, 2025 • 9 • 4

DavidAU/Qwen3-128k-30B-A3B-NEO-MAX-Imatrix-gguf

Text Generation • 31B • Updated Apr 28 • 3.74k • 37

DavidAU/Qwen3-30B-A1.5B-64K-High-Speed-NEO-Imatrix-MAX-gguf

Text Generation • 31B • Updated Apr 28 • 743 • 26

mradermacher/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-GGUF

25B • Updated Jul 31, 2025 • 163

mradermacher/Llama3.1-MOE-4X8B-Gated-IQ-Multi-Tier-Deep-Reasoning-32B-i1-GGUF

25B • Updated Jul 11, 2025 • 340

Prince-1/Hermes-4-14B-Onnx

Text Generation • Updated Apr 2

makaveli10/tinyllama-function-call-lora-adapter-250424-F16-GGUF

25.2M • Updated Jul 29, 2025 • 21

rogue-security/mcp-tool-use-quality-ranger-0.6b

Text Classification • 0.6B • Updated Sep 30, 2025 • 14

lmstudio-community/Hermes-4-405B-MLX-5bit

Text Generation • 406B • Updated Aug 26, 2025 • 312

lmstudio-community/Hermes-4-405B-MLX-4bit

Text Generation • 406B • Updated Aug 26, 2025 • 791 • 1