Models

1,832

Active filters: quantization

MilyaShams/Qwen3-1.7B-SparseGPT_4_8

2B • Updated Apr 8 • 2

MilyaShams/Qwen3-1.7B-SparseGPT_2_4_bs64

1B • Updated Apr 8 • 2

MilyaShams/Qwen3-1.7B-Pipe_PTQ_W8A8_Wanda_unstruct_0.5

2B • Updated Apr 8 • 4

MilyaShams/Qwen3-1.7B-Pipe_PTQ_W8A16_Wanda_2_4

0.6B • Updated Apr 19 • 2

MilyaShams/Qwen3-1.7B-Pipe_GPTQ_W4A16_SparseGPT_unstruct_0.5

2B • Updated Apr 8 • 2

Neural-ICE/Gemma-4-E4B-it-NVFP4

Text Generation • 6B • Updated Apr 8 • 749

MilyaShams/Qwen3-1.7B-Pipe_GPTQ_W8A8_SparseGPT_2_4

1B • Updated Apr 8 • 2

Neural-ICE/Gemma-4-E2B-it-NVFP4

Text Generation • 4B • Updated Apr 8 • 3.24k • 2

MilyaShams/Qwen3-1.7B-AutoRound_W4A16_iters400

2B • Updated Apr 8 • 1

MilyaShams/Qwen3-1.7B-Pipe_AWQ_W4A16_Wanda_unstruct_0.6

2B • Updated Apr 8 • 2

MilyaShams/Qwen3-1.7B-Pipe_AutoRound_W4A16_SparseGPT_unstruct_0.5

2B • Updated Apr 8 • 4

MilyaShams/Qwen3-1.7B-Pipe_Wanda_2_4_PTQ_W8A16

0.6B • Updated Apr 19 • 2

cs2764/GLM-5.1-FP8_dq4-mlx

Text Generation • 744B • Updated Apr 8 • 217

MilyaShams/Qwen3-1.7B-Pipe_SparseGPT_2_4_GPTQ_W8A8

1B • Updated Apr 8 • 2

MilyaShams/Qwen3-1.7B-Pipe_Wanda_unstruct_0.6_AWQ_W4A16

2B • Updated Apr 9 • 2

MilyaShams/Qwen3-1.7B-Pipe_Wanda_unstruct_0.5_AutoRound_W4A16

2B • Updated Apr 9 • 2

MilyaShams/Qwen3-1.7B-SmoothQuant_0.5_GPTQ_W8A8

2B • Updated Apr 9 • 1

MilyaShams/Qwen3-1.7B-SmoothQuant_0.5_AutoRound_W4A16

2B • Updated Apr 9 • 1

MilyaShams/Qwen3-1.7B-SmoothQuant_0.5_AutoRound_W8A8_ign

2B • Updated Apr 9 • 2

qubitron/LLaDA-8B-Quantized

Text Generation • Updated Apr 10

bbearforever/quantum-Qwen2.5-4-bit

Text Generation • Updated Apr 18

MilyaShams/Qwen3-1.7B-Pipe_SmoothQuant_0.5_PTQ_W8A8_Wanda_2_4

1B • Updated Apr 9 • 3

MilyaShams/Qwen3-1.7B-Pipe_Wanda_unstruct_0.5_PTQ_W8A8

2B • Updated Apr 9 • 2

MilyaShams/Qwen3-1.7B-Pipe_SparseGPT_unstruct_0.5_GPTQ_W4A16

2B • Updated Apr 9 • 2

MilyaShams/Qwen3-1.7B-Pipe_SparseGPT_unstruct_0.5_SmoothQuant_0.5_PTQ_W8A8

2B • Updated Apr 9 • 2

MilyaShams/Qwen3-1.7B-SmoothQuant_0.5_AutoRound_W8A8_iters200

2B • Updated Apr 9 • 2

rtrevinnoc/penumbra

ebircak/gemma-4-31B-it-GGUF

Text Generation • 31B • Updated Apr 9 • 140 • 1

MilyaShams/Qwen3-1.7B-SmoothQuant_0.5_AutoRound_W8A8

2B • Updated Apr 9 • 2

varjosoft/GLM-4.7-Flash-TQ3

13B • Updated Apr 9 • 5 • 1