Models: 24,097

Active filters: llama-cpp

Marlon81/Llama-3.1-SauerkrautLM-8b-Instruct-Q4_K_M-GGUF

8B • Updated Jul 25, 2024 • 10

kkwagh999/Meta-Llama-3.1-8B-Instruct-Q4_K_M-GGUF

Text Generation • 8B • Updated Jul 25, 2024 • 15

nanowell/deepseek-math-7b-rl-Q5_K_M-GGUF

7B • Updated Jul 25, 2024 • 18

pmathis89/Meta-Llama-3.1-8B-Instruct-Q6_K-GGUF

Text Generation • 8B • Updated Jul 25, 2024 • 6

pervll/bge-reranker-v2-gemma-Q4_K_M-GGUF

Text Classification • 3B • Updated Jul 26, 2024 • 36 • 3

jiejie0301/Llama3-8B-Chinese-Chat-Q8_0-GGUF

Text Generation • 8B • Updated Jul 26, 2024 • 4

huoxu/bge-large-en-v1.5-Q8_0-GGUF

Feature Extraction • 0.3B • Updated Jul 26, 2024 • 4

Kondara/Qwen2-7B-Instruct-Q4_K_M-GGUF

Text Generation • 8B • Updated Jul 26, 2024 • 2

Kondara/Yi-1.5-9B-Chat-Q4_K_M-GGUF

9B • Updated Jul 26, 2024

Kondara/Phi-3-mini-4k-instruct-Q4_K_M-GGUF

Text Generation • 4B • Updated Jul 26, 2024 • 7

Kondara/Phi-3-mini-128k-instruct-Q4_K_M-GGUF

Text Generation • 4B • Updated Jul 26, 2024 • 10

Kondara/Phi-3-medium-4k-instruct-Q4_K_M-GGUF

Text Generation • 14B • Updated Jul 26, 2024 • 1

Kondara/Phi-3-medium-128k-instruct-Q4_K_M-GGUF

Text Generation • 14B • Updated Jul 26, 2024 • 9

BafS/Phi-3-mini-4k-instruct-Q4_K_M-GGUF

4B • Updated Jul 26, 2024 • 5

Morgen0052/Meta-Llama-3.1-8B-Instruct-Q8_0-GGUF

Text Generation • 8B • Updated Jul 26, 2024 • 3

Kondara/gemma-2-9b-Q4_K_M-GGUF

Text Generation • 9B • Updated Jul 26, 2024 • 10

Kondara/falcon-7b-instruct-Q4_K_M-GGUF

7B • Updated Jul 26, 2024 • 9

jasonj2020/Llama3-8B-Chinese-Chat-Q4_K_M-GGUF

Text Generation • 8B • Updated Jul 26, 2024 • 3

Kondara/falcon-11B-Q4_K_M-GGUF

11B • Updated Jul 26, 2024 • 4

Morgen0052/Mistral-7B-Instruct-v0.3-Q8_0-GGUF

7B • Updated Jul 26, 2024 • 28

SteelQuants/L3.1-Meta-In-15B-Q6_K-GGUF

15B • Updated Jul 26, 2024 • 5

Kondara/vicuna-7b-v1.5-16k-Q4_K_M-GGUF

7B • Updated Jul 26, 2024 • 8

Kondara/vicuna-13b-v1.5-Q4_K_M-GGUF

13B • Updated Jul 26, 2024 • 6

ooff123/lawyer-llama-13b-v2-Q4_K_M-GGUF

13B • Updated Jul 26, 2024 • 41 • 3

Kondara/Mistral-7B-Instruct-v0.3-Q4_K_M-GGUF

7B • Updated Jul 26, 2024 • 1

Ransss/mini-magnum-12b-v1.1-Q8_0-GGUF

12B • Updated Jul 26, 2024 • 3

Kondara/cendol-llama2-7b-inst-Q4_K_M-GGUF

7B • Updated Aug 28, 2024 • 3

Kondara/SeaLLMs-v3-7B-Chat-Q4_K_M-GGUF

8B • Updated Jul 26, 2024 • 18

NikolayKozloff/Replete-LLM-Qwen2-7b_Beta-Preview-Q8_0-GGUF

8B • Updated Jul 26, 2024 • 14 • 2

ui7/Sujet-Finance-8B-v0.1-Q5_K_M-GGUF

Text Generation • 8B • Updated Jul 26, 2024 • 2