Active filters:
int4
RiverkanIT/Ling-mini-2.0-Quantized • Text Generation • Updated • 2
ForeseeLab/foreseeai-qwen3-4b-iot-int4 • Text Generation • 4B • Updated • 2 • 1
ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-nvfp • Image-Text-to-Text • 5B • Updated • 22
ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp • Image-Text-to-Text • 5B • Updated • 31
huawei-csl/Qwen3-1.7B-4bit-SINQ • Text Generation • 1B • Updated • 11 • 5
huawei-csl/Qwen3-1.7B-4bit-ASINQ • Text Generation • 1B • Updated • 9 • 5
huawei-csl/Qwen3-32B-4bit-SINQ • Text Generation • 18B • Updated • 11 • 7
huawei-csl/Qwen3-14B-4bit-SINQ • Text Generation • 9B • Updated • 10 • 5
huawei-csl/Qwen3-14B-4bit-ASINQ • Text Generation • 9B • Updated • 12 • 6
huawei-csl/Qwen3-32B-4bit-ASINQ • Text Generation • 18B • Updated • 12 • 8
ModelCloud/GLM-4.6-GPTQMODEL-W4A16-v1 • Text Generation • 357B • Updated • 3
ModelCloud/GLM-4.6-GPTQMODEL-W4A16-v2 • Text Generation • 357B • Updated • 3 • 1
PangaiaSoftware/YanoljaNEXT-Rosetta-4B-onnx • Translation • Updated • 2 • 2
RedHatAI/NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 • Text Generation • 2B • Updated • 177 • 5
ModelCloud/GLM-4.6-REAP-268B-A32B-GPTQMODEL-W4A16 • Text Generation • 269B • Updated • 48 • 2
AhtnaGlen/phi-4-mini-instruct-int4-sym-npu-ov • Text Generation • Updated • 9
tencent/DeepSeek-V3.1-Terminus-W4AFP8 • Text Generation • 349B • Updated • 1.16k • 15
ModelCloud/MiniMax-M2-GPTQMODEL-W4A16 • Text Generation • 229B • Updated • 58 • 3
ModelCloud/Marin-32B-Base-GPTQMODEL-W4A16 • Text Generation • 33B • Updated • 7 • 1
ModelCloud/Marin-32B-Base-GPTQMODEL-AWQ-W4A16 • Text Generation • 33B • Updated • 5 • 1
huawei-csl/Apertus-8B-2509-4bit-SINQ • Text Generation • 5B • Updated • 9 • 2
huawei-csl/Apertus-8B-2509-4bit-ASINQ • Text Generation • 5B • Updated • 14 • 2
ModelCloud/Granite-4.0-H-1B-GPTQMODEL-W4A16 • Text Generation • 1B • Updated • 3
ModelCloud/Granite-4.0-H-350M-GPTQMODEL-W4A16 • Text Generation • 0.3B • Updated • 22
ModelCloud/Brumby-14B-Base-GPTQMODEL-W4A16 • Text Generation • 15B • Updated • 3
ModelCloud/Brumby-14B-Base-GPTQMODEL-W4A16-v2 • Text Generation • 15B • Updated • 3
SherlockID365/Qwen3-VL-8B-Instruct-quantized.w4a16 • Image-Text-to-Text • 3B • Updated • 68 • 1
Ishant86/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-compressed-tensors-int4 • 6B • Updated • 2
zandzpider/Qwen3-30B-A3B-abliterated-erotic-autoround-int4 • 0.6B • Updated • 7
ikarius/Granite-3.2-8b-instruct-Abliterated-gs128-GPTQ-INT4 • Text Generation • 8B • Updated • 15 • 1