# Qwen3 Merged GGUF

## Available Files
- `qwen3_model_q8_0.gguf` - 8-bit quantized (34.8 GB)
- `qwen3_model_q6_k.gguf` - 6-bit quantized (26.9 GB) - uploading
- `qwen3_model_f16.gguf.aa`, `.ab`, `.ac` - F16 split into parts (65.5 GB total) - uploading
## Reassembling the F16 model
The F16 model was split due to the 50 GB per-file size limit. After downloading all parts, reassemble them with:

```shell
cat qwen3_model_f16.gguf.* > qwen3_model_f16.gguf
```
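As a sanity check, you can verify the split/reassemble round trip on a small dummy file before committing to the 65.5 GB download. The file names below are stand-ins, but the `.aa`/`.ab`/`.ac` suffix convention and the `cat` command match the real parts:

```shell
#!/bin/sh
set -eu

# Create a 1 MiB dummy file standing in for the model.
head -c 1048576 /dev/urandom > model.gguf

# Split it the same way the F16 model was split: two-letter suffixes
# (.aa, .ab, .ac), produced here by GNU/BSD split.
split -b 400k -a 2 model.gguf model.gguf.

# Reassemble with the same glob-based cat as above; shell globs sort
# alphabetically, so the parts concatenate in the correct order.
cat model.gguf.* > reassembled.gguf

# Byte-for-byte comparison confirms the round trip is lossless.
cmp model.gguf reassembled.gguf && echo "reassembly OK"
```

For the real model, a quicker check is that the reassembled file's size equals the sum of the part sizes.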
## Usage
Compatible with llama.cpp, LM Studio, and other GGUF-supporting inference engines.
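For example, a minimal llama.cpp invocation might look like the following (a sketch, assuming `llama-cli` is built and on your `PATH`; the context size and GPU layer count are illustrative values to adjust for your hardware):

```shell
# Run the 8-bit quantized model interactively with llama.cpp.
# -m  path to the GGUF file
# -c  context window size in tokens
# -ngl number of layers to offload to the GPU (0 for CPU-only)
llama-cli -m qwen3_model_q8_0.gguf -c 4096 -ngl 99 -p "Hello, introduce yourself."
```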