Models: 1,112

Active filters: rag

ajesujoba/AfriHuBERT

Updated Jun 24 • 348 • 6

belyakoff/llama-3.2-3b-instruct-fine-tuned

Text Generation • 3B • Updated Sep 30, 2024 • 10 • 3

belyakoff/llama-3.2-3b-instruct-fine-tuned-gptq-8bit

Text Generation • 1B • Updated Sep 30, 2024 • 17 • 2

Anirudh6778/t5_fineTuned_RAFT

Text Generation • 0.2B • Updated Oct 23, 2024 • 5

belyakoff/SmolLM2-360M-Instruct-FT

Text Generation • 0.4B • Updated Nov 6, 2024 • 7 • 2

mradermacher/docsgpt-7b-mistral-GGUF

7B • Updated Nov 14, 2024 • 25

mradermacher/docsgpt-7b-mistral-i1-GGUF

7B • Updated Nov 15, 2024 • 325

tensorblock/SmolLM2-360M-Instruct-FT-GGUF

Text Generation • 0.4B • Updated Jan 27 • 47 • 1

mys/minilm

Updated Dec 5, 2024

itlwas/SmolLM2-360M-Instruct-FT-Q4_K_M-GGUF

Text Generation • 0.4B • Updated Dec 18, 2024 • 26

tensorblock/docsgpt-7b-mistral-GGUF

7B • Updated Jan 27 • 51

doubleyyh/exit-gemma-7b

Updated Dec 21, 2024

doubleyyh/exit-gemma-2b

Updated Dec 21, 2024 • 2

doubleyyh/email-tuned-qwen2-lora

Text Generation • Updated Dec 26, 2024 • 3

rokeya71/granite-embedding-125m-english-onnx

Feature Extraction • Updated Dec 27, 2024 • 5 • 1

cnmoro/Qwen0.5b-RagSemanticChunker

Text Generation • 0.5B • Updated Feb 9, 2025 • 5 • 4

Josephgflowers/Phinance-Phi-3.5-mini-instruct-finance-v0.3

4B • Updated Feb 7, 2025 • 7 • 1

altaidevorg/bge-m3-distill-8l-letsearch

Updated Jan 20, 2025

cnmoro/Qwen3b-RagSemanticChunker

Text Generation • 3B • Updated Jan 21, 2025 • 6 • 2

mradermacher/Qwen0.5b-RagSemanticChunker-GGUF

0.5B • Updated Jan 21, 2025 • 67

mradermacher/Qwen0.5b-RagSemanticChunker-i1-GGUF

0.5B • Updated Jan 21, 2025 • 88

rakmik/z

0.6B • Updated Jan 22, 2025 • 5

silma-ai/SILMA-Kashif-2B-Instruct-v1.0

Text Generation • 3B • Updated Jun 11, 2025 • 994 • 24

mradermacher/SILMA-Kashif-2B-Instruct-v1.0-GGUF

3B • Updated Jan 28, 2025 • 66 • 1

mradermacher/SILMA-Kashif-2B-Instruct-v1.0-i1-GGUF

3B • Updated Jan 28, 2025 • 40 • 1

h1tak3/rag-phishing-detector

0.6B • Updated Feb 6, 2025 • 7 • 1

mradermacher/Phinance-Phi-3.5-mini-instruct-finance-v0.3-GGUF

4B • Updated Feb 8, 2025 • 136

tjohn327/scion-minilm-l6-v2

22.7M • Updated Feb 27, 2025 • 2

tensorblock/SILMA-Kashif-2B-Instruct-v1.0-GGUF

Text Generation • 3B • Updated Jan 27 • 96

DISLab/Ext2Gen-8B-R2

Question Answering • 8B • Updated Mar 19, 2025 • 16 • 7