Active filters: vLLM
QuantTrio/Qwen3-Coder-30B-A3B-Instruct-GPTQ-Int8
Text Generation
• 31B • Updated • 774
• 8
QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ
Text Generation
• 31B • Updated • 396k
• 8
EliovpAI/Qwen3-14B-FP8-KV
Text Generation
• 15B • Updated • 3
• 2
Image-Text-to-Text
• 17B • Updated • 1.5k
• 19
QuantTrio/Seed-OSS-36B-Instruct-AWQ
Text Generation
• 36B • Updated • 458
• 8
QuantTrio/Seed-OSS-36B-Instruct-GPTQ-Int8
Text Generation
• 36B • Updated • 103
• 4
QuantTrio/Seed-OSS-36B-Instruct-GPTQ-Int4
Text Generation
• 36B • Updated • 13
• 5
QuantTrio/Seed-OSS-36B-Instruct-GPTQ-Int3
Text Generation
• 34B • Updated • 8
• 3
amakhov/tiny-random-llama
Text Generation
• 4.18M • Updated • 68
Text Generation
• 41B • Updated • 4
• 2
QuantTrio/DeepSeek-V3.1-AWQ
Text Generation
• 684B • Updated • 306
• 5
QuantTrio/DeepSeek-V3.1-AWQ-Fp16Mix
Text Generation
• 684B • Updated • 15
• 1
QuantTrio/DeepSeek-V3.1-AWQ-Lite
Text Generation
• 684B • Updated • 282
• 3
JunHowie/Qwen3-4B-Instruct-2507-GPTQ-Int4
Text Generation
• 4B • Updated • 2.75k
• 4
JunHowie/Qwen3-4B-Instruct-2507-GPTQ-Int8
Text Generation
• 4B • Updated • 368
JunHowie/Qwen3-4B-Thinking-2507-GPTQ-Int4
Text Generation
• 4B • Updated • 137
• 1
JunHowie/Qwen3-4B-Thinking-2507-GPTQ-Int8
Text Generation
• 4B • Updated • 12
• 2
JunHowie/Qwen3-30B-A3B-Instruct-2507-GPTQ-Int4
Text Generation
• 31B • Updated • 1.86k
JunHowie/Qwen3-30B-A3B-Instruct-2507-GPTQ-Int8
Text Generation
• 31B • Updated • 6
JunHowie/Qwen3-30B-A3B-Thinking-2507-GPTQ-Int4
Text Generation
• 31B • Updated • 144
JunHowie/Qwen2-7B-Instruct-GPTQ-Int4
Text Generation
• 8B • Updated • 4.27k
JunHowie/Qwen2-7B-Instruct-GPTQ-Int8
Text Generation
• 8B • Updated
EliovpAI/Deepseek-R1-0528-Qwen3-8B-FP8-KV
Text Generation
• 8B • Updated • 65
JunHowie/Qwen3-30B-A3B-Thinking-2507-GPTQ-Int8
Text Generation
• 31B • Updated • 5
JunHowie/Seed-OSS-36B-Instruct-GPTQ-Int4
Text Generation
• 36B • Updated • 2
JunHowie/Seed-OSS-36B-Instruct-GPTQ-Int8
Text Generation
• 36B • Updated
QuantTrio/Qwen3-VL-235B-A22B-Instruct-AWQ
Text Generation
• 236B • Updated • 3.13k
• 13
QuantTrio/Qwen3-VL-235B-A22B-Instruct-FP8
Text Generation
• Updated • 137
QuantTrio/Qwen3-VL-235B-A22B-Thinking-AWQ
Text Generation
• 236B • Updated • 1.4k
• 8
QuantTrio/Qwen3-VL-235B-A22B-Thinking-FP8
Text Generation
• 236B • Updated • 30