Models

5,324

Active filters: llama.cpp

XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k-AWQ

Text Generation • Updated Jul 9, 2024 • 4

mgonzs13/stablelm-zephyr-3B-localmentor-GGUF

Text Generation • 3B • Updated Jul 3, 2024 • 324

google/gemma-2-2b-it-GGUF

3B • Updated Aug 27, 2024 • 206 • 89

google/gemma-2-2b-GGUF

3B • Updated Aug 2, 2024 • 29 • 20

chatpdflocal/llama3.1-8b-gguf

8B • Updated Mar 23 • 307 • 29

akshathmangudi/llama3.1-8b-gguf

Updated Jul 26, 2024

dahara1/llama-translate-gguf

8B • Updated Aug 14, 2024 • 111 • 16

jhilburn/gemma-inference

Text Generation • Updated Aug 7, 2024

ghost-x/ghost-8b-beta-1608-gguf

Text Generation • 8B • Updated Aug 26, 2024 • 160 • 6

PaulJusst/codegemma-7b-it-GGUF

Text Generation • 9B • Updated Sep 13, 2024

v000000/Typhon-Mixtral-v1-imatrix-v2.Q6_K-GGUF

Updated Sep 26, 2024 • 54

cstr/Ministral-8B-Instruct-2410-GGUF

8B • Updated Oct 17, 2024 • 37 • 1

mrcuddle/Lumimaid-v0.2-12B-Q4_K_M-GGUF

Text Generation • 12B • Updated Oct 20, 2024 • 25

Manel/Llama-3.1-8B-Instruct-Q4_K_M-GGUF

8B • Updated Nov 3, 2024 • 7

Manel/Llama-2-13b-chat-hf-Q4_0-GGUF

Text Generation • 13B • Updated Nov 3, 2024 • 5

dumb-dev/flan-t5-xxl-gguf

11B • Updated Oct 29, 2024 • 560 • 18

Manel/gemma-2-9b-Q4_0-GGUF

9B • Updated Nov 3, 2024 • 2

DiYaZeN/aya-sl-biz-8b

Text Generation • 8B • Updated Oct 31, 2024 • 10

shreyasmeher/ConflLlama

Text Classification • 8B • Updated Jul 8, 2025 • 58 • 4

dwikitheduck/gen-try1-Q4_K_M-GGUF

15B • Updated Nov 11, 2024

shreyasmeher/ConflLlama-Alt

Text Classification • 8B • Updated Nov 19, 2024 • 9 • 1

XeAI/LLaMa_3.2_3B_Instruct_Text2SQL-Q4_K_M-GGUF

Text Generation • Updated Nov 17, 2024 • 22

dwikitheduck/gen-sql-1-Q4_K_M-GGUF

8B • Updated Nov 18, 2024 • 1

jsjeon/SummLlama3.2-3B-Q4_K_M-GGUF

Updated Nov 19, 2024

dwikitheduck/gen-inst-1-Q4_K_M-GGUF

15B • Updated Nov 25, 2024

Vikhrmodels/Vikhr-Qwen-2.5-1.5B-Instruct-GGUF

2B • Updated Nov 26, 2024 • 491 • 5

McaTech/Nonet

Text Generation • 0.1B • Updated Jun 30, 2025 • 248 • 3

lianghsun/Llama-3.2-Taiwan-3B-Instruct-GGUF

Text Generation • 4B • Updated May 4 • 542 • 10

phymbert/Phi-3.5-MoE-instruct-GGUF

Text Generation • 42B • Updated Dec 29, 2024 • 34 • 1

carsenk/llama3.2_3b_122824_uncensored

Text Generation • 3B • Updated Dec 31, 2024 • 287 • 3