Active filters: int4
ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-nvfp
Image-Text-to-Text
• 5B • Updated • 5
ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp
Image-Text-to-Text
• 5B • Updated • 4
huawei-csl/Qwen3-1.7B-4bit-SINQ
Text Generation
• 1B • Updated • 3
• 5
huawei-csl/Qwen3-1.7B-4bit-ASINQ
Text Generation
• 1B • Updated • 4
• 5
huawei-csl/Qwen3-32B-4bit-SINQ
Text Generation
• 18B • Updated • 16
• 7
huawei-csl/Qwen3-14B-4bit-SINQ
Text Generation
• 9B • Updated • 6
• 5
huawei-csl/Qwen3-14B-4bit-ASINQ
Text Generation
• 9B • Updated • 1
• 6
huawei-csl/Qwen3-32B-4bit-ASINQ
Text Generation
• 18B • Updated • 7
• 8
ModelCloud/GLM-4.6-GPTQMODEL-W4A16-v1
Text Generation
• 357B • Updated • 8
ModelCloud/GLM-4.6-GPTQMODEL-W4A16-v2
Text Generation
• 357B • Updated • 2
• 1
PangaiaSoftware/YanoljaNEXT-Rosetta-4B-onnx
Translation
• Updated • 3
• 2
RedHatAI/NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16
Text Generation
• 2B • Updated • 2.3k
• 5
ModelCloud/GLM-4.6-REAP-268B-A32B-GPTQMODEL-W4A16
Text Generation
• 269B • Updated • 5
• 2
AhtnaGlen/phi-4-mini-instruct-int4-sym-npu-ov
Text Generation
• Updated • 34
tencent/DeepSeek-V3.1-Terminus-W4AFP8
Text Generation
• 349B • Updated • 478
• 16
ModelCloud/MiniMax-M2-GPTQMODEL-W4A16
Text Generation
• 229B • Updated • 10
• 3
ModelCloud/Marin-32B-Base-GPTQMODEL-W4A16
Text Generation
• 33B • Updated • 1
• 1
ModelCloud/Marin-32B-Base-GPTQMODEL-AWQ-W4A16
Text Generation
• 33B • Updated • 6
• 2
huawei-csl/Apertus-8B-2509-4bit-SINQ
Text Generation
• 5B • Updated • 5
• 2
huawei-csl/Apertus-8B-2509-4bit-ASINQ
Text Generation
• 5B • Updated • 292
• 3
ModelCloud/Granite-4.0-H-1B-GPTQMODEL-W4A16
Text Generation
• 1B • Updated • 1
• 1
ModelCloud/Granite-4.0-H-350M-GPTQMODEL-W4A16
Text Generation
• 0.3B • Updated • 2
• 1
ModelCloud/Brumby-14B-Base-GPTQMODEL-W4A16
Text Generation
• 15B • Updated • 1
ModelCloud/Brumby-14B-Base-GPTQMODEL-W4A16-v2
Text Generation
• 15B • Updated • 4
• 1
SherlockID365/Qwen3-VL-8B-Instruct-quantized.w4a16
Image-Text-to-Text
• 3B • Updated • 307
• 1
Ishant86/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-compressed-tensors-int4
6B • Updated • 10
zandzpider/Qwen3-30B-A3B-abliterated-erotic-autoround-int4
0.6B • Updated • 16
• 1
ikarius/Granite-3.2-8b-instruct-Abliterated-gs128-GPTQ-INT4
Text Generation
• 8B • Updated • 4
• 1
huawei-csl/Kimi-Linear-48B-A3B-Instruct-4bit-SINQ
Text Generation
• 27B • Updated • 16
• 3
huawei-csl/Qwen3-Next-80B-A3B-Instruct-4bit-SINQ
Text Generation
• Updated • 11
• 2