Models: 1,828

Active filters: quantization

Shifusen/Mistral-Large-Instruct-2411-NVFP4

69B • Updated Feb 12 • 76

ludovicoYIN/MiniMax-M2-BF16-W4A16

Text Generation • 34B • Updated Feb 13 • 5

JasperHG90/ms-marco-minilm-l12-hindsight-reranker

Text Classification • Updated Mar 22 • 431

JasperHG90/minilm-l12-v2-hindsight-embeddings

Feature Extraction • Updated Feb 13 • 363

JasperHG90/distilbert-hindsight-ner

Token Classification • Updated Feb 13 • 331

Akicou/Ministral-3-3B-Instruct-2512-BF16-Heretic-GGUF

3B • Updated Feb 13 • 311 • 1

petergilani/MiniMax-M2.5-mix3-6bit

229B • Updated Feb 14 • 26 • 1

Shifusen/Lumimaid-v0.2-123B-NVFP4

Text Generation • 69B • Updated Feb 14 • 2

namgyu-youn/Qwen3-8B-tenary

Text Generation • 8B • Updated Feb 23 • 3

namgyu-youn/Qwen3-8B-greedy

Text Generation • 8B • Updated Feb 14 • 2

OpenTransformer/unary-quantization-research

mukesh3444/LTX2-Rapid-Merges-GGUF

Image-Text-to-Video • 19B • Updated Feb 15 • 148

Al-hin/Llama-3.2-1B-Instruct-IMat-GGUF

Text Generation • 1B • Updated Feb 17 • 80

Joysulem/FireEcho

Text Generation • Updated Feb 17 • 3

Ne7/LTX2-Rapid-Merges-GGUF

Image-Text-to-Video • 19B • Updated Feb 17 • 101

tobleron900/Nanbeige-4.1-3B-Instruct-GGUF

Updated Feb 26 • 8 • 1

namgyu-youn/EXAONE-4.0-1.2B-LLMC-AWQ-W4

1B • Updated Feb 18 • 4

oshkorinova/MamayLM-Gemma-3-12B-IT-v1.0-FP8-Dynamic

Text Generation • 12B • Updated Feb 25 • 3

EricRollei/HunyuanImage-3-INT8-v2

Text-to-Image • 83B • Updated Feb 21 • 41 • 1

EricRollei/HunyuanImage-3-NF4-v2

Text-to-Image • 83B • Updated Feb 21 • 91 • 1

EricRollei/HunyuanImage-3.0-Instruct-INT8-v2

Text-to-Image • 83B • Updated Feb 22 • 33 • 1

EricRollei/HunyuanImage-3.0-Instruct-NF4-v2

Text-to-Image • 83B • Updated Feb 22 • 260 • 5

EricRollei/HunyuanImage-3.0-Instruct-Distil-NF4-v2

Text-to-Image • 83B • Updated Feb 22 • 286 • 14

EricRollei/HunyuanImage-3.0-Instruct-Distil-INT8-v2

Text-to-Image • 83B • Updated Feb 22 • 80 • 7

Tushe/nllb-200-600M-ct2-float16

Translation • Updated Feb 22 • 8 • 1

Tushe/nllb-200-600M-ct2-int8

Translation • Updated Feb 22 • 115

Jakubrd4/Bielik-11B-v2.3-Instruct-QuIP-2bit

Text Generation • 0.6B • Updated Feb 22 • 2

Tushe/nllb-200-600M-nf4-custom-weights-bare-metal

Translation • 0.4B • Updated Feb 24 • 3

MO7YW4NG/ms-marco-MiniLM-L-6-v2-4bit-nf4

Text Ranking • 23.1M • Updated Feb 24 • 1

groxaxo/qwen3-embed-8b-gptq

Feature Extraction • 8B • Updated Feb 26 • 605 • 2