Models: 9,204 (active filter: chat)

CuckmeisterFuller/Hermes-3-Llama-3.2-3B-Q4-mlx

Text Generation • 0.5B • Updated Dec 27, 2024 • 33 • 1

fuzzy-mittenz/Thoth-Llama3.2-3B-IQ4_NL-GGUF

3B • Updated Feb 12, 2025 • 14 • 1

IntelligentEstate/Thoth_Warding-Llama-3B-IQ5_K_S-GGUF

3B • Updated Mar 7, 2025 • 9 • 4

QuantFactory/Control-Nanuq-8B-GGUF

8B • Updated Dec 27, 2024 • 191 • 2

phymbert/Qwen2-7B-Instruct-Q4_K_M-GGUF

Text Generation • 8B • Updated Dec 27, 2024 • 5

DonGan13/Qwen2.5-Coder-7B-Instruct-Q4_K_S-GGUF

Text Generation • 8B • Updated Dec 27, 2024 • 13

DonGan13/Qwen2.5-Coder-7B-Instruct-IQ3_XXS-GGUF

Text Generation • 8B • Updated Dec 27, 2024 • 10

DonGan13/Qwen2.5-14B-Instruct-Q3_K_S-GGUF

Text Generation • 15B • Updated Dec 27, 2024 • 18

finnvoorhees/coreml-Qwen2.5-Coder-1.5B-Instruct-4bit

Text Generation • Updated Dec 27, 2024 • 7

Goekdeniz-Guelmez/josie-3b-v6.0-epoch1

Text Generation • 3B • Updated Dec 27, 2024 • 11

tensorblock/QwQ-32B-Preview-GGUF

33B • Updated Jan 27 • 43

finnvoorhees/coreml-Qwen2.5-0.5B-Instruct-4bit

Text Generation • Updated Dec 28, 2024 • 59 • 1

finnvoorhees/coreml-Qwen2.5-Coder-0.5B-Instruct-4bit

Text Generation • Updated Dec 28, 2024 • 11

Lucy-in-the-Sky/QwQ-32B-Preview-Q4_K_M-GGUF

33B • Updated Dec 28, 2024 • 5

finnvoorhees/coreml-Qwen2.5-3B-Instruct-4bit

Text Generation • Updated Dec 28, 2024 • 16

itlwas/magnum-v4-22b-Q4_K_M-GGUF

Text Generation • 22B • Updated Dec 28, 2024 • 188

mrG46487/CodeQwen1.5-7B-Chat-Q4_K_M-GGUF

Text Generation • 7B • Updated Dec 29, 2024 • 5

mradermacher/EZO-Qwen2.5-32B-Instruct-GGUF

33B • Updated Dec 29, 2024 • 82

mradermacher/EZO-Qwen2.5-32B-Instruct-i1-GGUF

33B • Updated Dec 29, 2024 • 91

evilfreelancer/o1_t-lite-it-1.0_lora

Question Answering • Updated Dec 30, 2024 • 8 • 3

Goekdeniz-Guelmez/josie-3b-v6.0

Text Generation • 3B • Updated Dec 29, 2024 • 8 • 1

Goekdeniz-Guelmez/josie-7b-v6.0

Text Generation • 8B • Updated Dec 29, 2024 • 10 • 2

Goekdeniz-Guelmez/josie-7b-v6.0-Q4_K_M-GGUF

Text Generation • 8B • Updated Dec 29, 2024 • 22 • 1

Goekdeniz-Guelmez/josie-3b-v6.0-Q5_K_M-GGUF

Text Generation • 3B • Updated Dec 29, 2024 • 12

nitky/Llama-3.3-SuperSwallowX-70B-RP-v0.1

Text Generation • 71B • Updated Dec 30, 2024 • 8

mradermacher/Qwen2.5-32B-Instruct-abliterated-GGUF

33B • Updated Apr 29, 2025 • 2.95k • 2

mradermacher/Qwen1.5-4B-Chat-GGUF

4B • Updated Jul 31, 2025 • 67

mradermacher/Qwen1.5-4B-Chat-i1-GGUF

4B • Updated Dec 30, 2024 • 31

evilfreelancer/o1_t-lite-it-1.0_gguf

Question Answering • 8B • Updated Dec 30, 2024 • 44 • 2

yujiepan/qvq-preview-tiny-random

Image-Text-to-Text • 4.9M • Updated Dec 30, 2024 • 10