Models

5,324

Active filters: llama.cpp

google/gemma-2b-it-GGUF

3B • Updated Jun 27, 2024 • 80 • 24

google/gemma-2b-GGUF

3B • Updated Jun 27, 2024 • 79 • 20

iAkashPaul/Indic-gemma-2b-finetuned-sft-Navarasa-GGUF

3B • Updated Mar 8, 2024 • 29 • 3

MrOvkill/gemma-2-inference-endpoint-GGUF

Text Generation • Updated Mar 11, 2024 • 6

google/gemma-1.1-7b-it-GGUF

9B • Updated Jun 27, 2024 • 22

google/gemma-1.1-2b-it-GGUF

3B • Updated Jun 27, 2024 • 1 • 21

webbigdata/C3TR-Adapter_gguf

Translation • 9B • Updated Aug 14, 2024 • 136 • 26

google/codegemma-2b-GGUF

Text Generation • 3B • Updated Jun 27, 2024 • 91 • 35

google/codegemma-7b-GGUF

Text Generation • 9B • Updated Jun 27, 2024 • 27 • 26

google/codegemma-7b-it-GGUF

Text Generation • 9B • Updated Jun 27, 2024 • 76 • 65

pacozaa/bonito-gguf

7B • Updated Apr 14, 2024 • 31

pmking27/PrathameshLLM-2B-GGUF

3B • Updated Apr 9, 2024 • 890 • 1

teleprint-me/cyberpunk-valerie-v0.1

Text Generation • 90.1M • Updated Apr 18, 2024 • 68 • 1

qwp4w3hyb/Meta-Llama-3-8B-Instruct-iMat-GGUF

Text Generation • 8B • Updated Apr 29, 2024 • 1.81k • 6

mgonzs13/Mistroll-7B-v2.2-GGUF

Text Generation • 7B • Updated Apr 29, 2024 • 23

mgonzs13/ladybird-base-7B-v8-GGUF

Text Generation • 7B • Updated Apr 29, 2024 • 434

google/codegemma-1.1-2b-GGUF

Text Generation • 3B • Updated Jun 27, 2024 • 16 • 6

google/codegemma-1.1-7b-it-GGUF

Text Generation • 9B • Updated Jun 27, 2024 • 1 • 14

openbmb/MiniCPM-Llama3-V-2_5-gguf

Updated Feb 27, 2025 • 2.88k • 217

QuantFactory/Ahma-3B-GGUF

Text Generation • 4B • Updated Jul 2, 2025 • 540 • 2

mgonzs13/TextBase-7B-v0.1-GGUF

Text Generation • 7B • Updated Jun 11, 2024 • 41

QuantFactory/TextBase-7B-v0.1-GGUF

Text Generation • 7B • Updated Jun 18, 2024 • 117

njwright92/ComicBot_v.2-gguf

Text Generation • 7B • Updated Aug 30, 2024 • 71

Irathernotsay/qwen2-1.5B-medical_qa-Finetune

Text Generation • 2B • Updated Jul 17, 2024 • 5

palusi/Qwen2-0.5B-Instruct-GGUF

0.5B • Updated Jun 27, 2024 • 82

XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k

Text Generation • Updated Jul 9, 2024 • 12

ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF

8B • Updated Jun 30, 2024 • 17

XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k-GGUF

Text Generation • Updated Jul 9, 2024 • 26

XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k-GPTQ

Text Generation • Updated Jul 9, 2024 • 6

zhhan/Phi-3-mini-4k-instruct_gguf_derived

Summarization • 4B • Updated Jul 2, 2024 • 25