Models: 9,209

Active filters: chat

RedHatAI/Qwen2.5-14B-Instruct-quantized.w8a8

Text Generation • 15B • Updated Dec 17, 2024 • 102

imkebe/QwQ-32B-Preview-rk3588-1.1.3

Text Generation • Updated Dec 10, 2024 • 6

Mahmoud-Selim/Llama

Text Generation • Updated Dec 10, 2024

Mahmoud-Selim/Llama-Instruct

Text Generation • 1.7M • Updated Dec 10, 2024 • 7

netnk9151/Llama-DNA-1.0-8B-Instruct-Q8_0-GGUF

Text Generation • 8B • Updated Dec 10, 2024 • 10

mradermacher/Llama-DNA-1.0-8B-Instruct-GGUF

8B • Updated Dec 11, 2024 • 92

mradermacher/Llama-DNA-1.0-8B-Instruct-i1-GGUF

8B • Updated Dec 11, 2024 • 609

sethut/QwQ-32B-Preview-Q8_0-GGUF

33B • Updated Dec 11, 2024 • 4

QuantFactory/Llama-DNA-1.0-8B-Instruct-GGUF

Text Generation • 8B • Updated Dec 11, 2024 • 79 • 2

madroid/Qwen2.5-3B-Instruct-4bit-mlx

Text Generation • 0.5B • Updated Dec 11, 2024 • 18

NousResearch/Hermes-3-Llama-3.2-3B-GGUF

3B • Updated Dec 18, 2024 • 4.27k • 73

mlx-community/Josiefied-Qwen2.5-14B-Instruct-abliterated-v4-4-bit

Text Generation • 2B • Updated Dec 11, 2024 • 59 • 1

eligapris/Qwen2.5-Coder-32B-Instruct-Q4_K_M-GGUF

Text Generation • 33B • Updated Dec 11, 2024 • 9

tensorblock/Sailor-0.5B-Chat-GGUF

0.6B • Updated Jan 27 • 78

mlx-community/QwQ-32B-Coder-Fusion-9010-4bit

Text Generation • 5B • Updated Dec 11, 2024 • 23 • 1

tensorblock/Experiment31-7B-GGUF

Text Generation • 7B • Updated Jan 27 • 8

chende2024/QwQ-32B-Preview-Q4_0-GGUF

33B • Updated Dec 11, 2024 • 4 • 1

chende2024/Qwen2.5-1.5B-Instruct-Q4_K_M-GGUF

Text Generation • 2B • Updated Dec 11, 2024 • 24

lianghsun/Llama-3.2-Taiwan-1B-Instruct

Text Generation • 1B • Updated May 4 • 3

zai-org/VisionReward-Image

Text Generation • Updated Jan 20, 2025 • 11

NexaAI/OmniAudio-2.6B

Audio-Text-to-Text • 0.6B • Updated Dec 13, 2024 • 485 • 289

bartowski/Hermes-3-Llama-3.2-3B-GGUF

Text Generation • 3B • Updated Dec 11, 2024 • 14.3k • 15

ggml-org/Qwen2.5-Coder-1.5B-32B-speculative-GGUF

Text Generation • 2B • Updated Dec 11, 2024 • 84 • 5

mlx-community/Hermes-3-Llama-3.2-3B-4bit

Text Generation • 0.5B • Updated Dec 11, 2024 • 215 • 1

mlx-community/Hermes-3-Llama-3.2-3B-8bit

Text Generation • 0.9B • Updated Dec 11, 2024 • 72 • 1

mlx-community/Hermes-3-Llama-3.2-3B-bf16

Text Generation • 3B • Updated Dec 11, 2024 • 29

mradermacher/Holland-4B-V1-GGUF

5B • Updated Apr 22, 2025 • 114 • 1

JackeyLai/Qwen2.5-3B-Instruct-Q4_0-GGUF

Text Generation • 3B • Updated Dec 12, 2024 • 21

JackeyLai/Qwen2.5-7B-Instruct-Q4_0-GGUF

Text Generation • 8B • Updated Dec 12, 2024 • 24

cphan-intersystems/Qwen2.5-Coder-32B-Instruct-Q4_K_M-GGUF

Text Generation • 33B • Updated Dec 12, 2024 • 7