Inference Providers
Active filters: vLLM
| Model | Pipeline | Params | Downloads | Likes |
|---|---|---|---|---|
| mistralai/Mistral-Medium-3.5-128B | | 128B | 21.3k | 297 |
| mistralai/Mistral-Medium-3.5-128B-EAGLE | | | 364 | 34 |
| mistralai/Mistral-Small-4-119B-2603 | | 119B | 63.1k | 372 |
| mistralai/Mistral-Small-4-119B-2603-eagle | | | 261 | 51 |
| olka-fi/Mistral-Medium-3.5-128B-MXFP4 | Text Generation | 128B | 733 | 2 |
| JunHowie/Qwen3-4B-Instruct-2507-GPTQ-Int4 | Text Generation | 4B | 34.4k | 4 |
| | Text Generation | | 3.32k | 28 |
| QuantTrio/GLM-4.7-Flash-AWQ | Text Generation | 31B | 50.3k | 13 |
| | Image-Text-to-Text | 10B | 237k | 13 |
| mistralai/Mistral-Small-4-119B-2603-NVFP4 | | | 1.03k | 85 |
| QuantTrio/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-AWQ | Image-Text-to-Text | 28B | 121k | 13 |
| QuantTrio/MiniMax-M2.7-AWQ | Text Generation | 229B | 28.8k | 7 |
| | Text Generation | 754B | 1.06k | 6 |
| QuantTrio/Qwen3.6-27B-AWQ | Image-Text-to-Text | 28B | 264k | 9 |
| QuantTrio/Qwen3.6-27B-AWQ-6Bit | Image-Text-to-Text | 28B | 13.8k | 6 |
| RecViking/Mistral-Medium-3.5-128B-NVFP4 | | 74B | 5.72k | 2 |
| cyankiwi/Mistral-Medium-3.5-128B-AWQ-INT4 | | 25B | 280 | 1 |
| mradermacher/Mistral-Medium-3.5-128B-GGUF | | 125B | 935 | 1 |
| model-scope/glm-4-9b-chat-GPTQ-Int4 | Text Generation | 9B | 99 | 6 |
| model-scope/glm-4-9b-chat-GPTQ-Int8 | Text Generation | 9B | 10 | 2 |
| tclf90/qwen2.5-72b-instruct-gptq-int4 | Text Generation | 73B | 94 | 2 |
| tclf90/qwen2.5-72b-instruct-gptq-int3 | Text Generation | 69B | 75 | |
| prithivMLmods/Nu2-Lupi-Qwen-14B | Text Generation | 15B | 6 | 2 |
| mradermacher/Nu2-Lupi-Qwen-14B-GGUF | | 15B | 161 | 1 |
| mradermacher/Nu2-Lupi-Qwen-14B-i1-GGUF | | 15B | 411 | 1 |
| JunHowie/Qwen3-0.6B-GPTQ-Int4 | Text Generation | 0.6B | 373 | 1 |
| JunHowie/Qwen3-0.6B-GPTQ-Int8 | Text Generation | 0.6B | 18 | |
| JunHowie/Qwen3-1.7B-GPTQ-Int4 | Text Generation | 2B | 2.69k | 1 |
| JunHowie/Qwen3-1.7B-GPTQ-Int8 | Text Generation | 2B | 16 | |
| JunHowie/Qwen3-32B-GPTQ-Int4 | Text Generation | 33B | 28.6k | 4 |