Qwen3-32B – GGUF Quantized

This repo contains Qwen3-32B quantized weights for llama.cpp.

Included formats

Qwen3-32B-f16.gguf
Qwen3-32B-Q8_0.gguf
Qwen3-32B-Q6_K.gguf
Qwen3-32B-Q5_K_M.gguf
Qwen3-32B-Q4_K_M.gguf
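
To help choose between the files above, the following is a rough sketch that estimates each file's size from an approximate bits-per-weight figure. The bits-per-weight values are assumptions based on typical llama.cpp figures, not measurements of the files in this repo, and the 33B parameter count is the one listed for this model. Actual file sizes will differ somewhat, and running the model needs extra memory for the context on top of the weights.

```python
# Approximate bits per weight for each quantization type.
# Assumption: typical llama.cpp figures, NOT measured from this repo's files.
APPROX_BITS_PER_WEIGHT = {
    "f16": 16.0,
    "Q8_0": 8.5,
    "Q6_K": 6.56,
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.85,
}

# Parameter count as listed for this model (33B params).
PARAM_COUNT = 33e9


def filename_for(quant: str) -> str:
    """Build the file name used in this repo for a quantization type."""
    return f"Qwen3-32B-{quant}.gguf"


def estimate_size_gb(quant: str, params: float = PARAM_COUNT) -> float:
    """Return a rough weight size in GB (10^9 bytes) for a quantization type."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    return params * bits / 8 / 1e9


if __name__ == "__main__":
    for quant in APPROX_BITS_PER_WEIGHT:
        print(f"{filename_for(quant):<24} ~{estimate_size_gb(quant):.0f} GB")
```

As a rule of thumb, smaller files trade some output quality for lower memory use, so pick the largest file that fits comfortably in RAM or VRAM.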

Usage

./llama-cli -m Qwen3-32B-Q4_K_M.gguf -p "Hello"
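
The command above can also be assembled from a script. The sketch below builds the llama-cli argument list with a few common options: -n (number of tokens to generate), -c (context size) and -ngl (layers to offload to the GPU). The helper name build_llama_cli_command and the default values are illustrative assumptions, not settings recommended by this repo.

```python
import shlex


def build_llama_cli_command(
    model_path: str,
    prompt: str,
    n_predict: int = 128,   # assumed default, adjust as needed
    ctx_size: int = 4096,   # assumed default, adjust as needed
    gpu_layers: int = 0,    # 0 keeps all layers on the CPU
    binary: str = "./llama-cli",
) -> list[str]:
    """Return the argv list for a llama-cli invocation (hypothetical helper)."""
    cmd = [
        binary,
        "-m", model_path,
        "-p", prompt,
        "-n", str(n_predict),
        "-c", str(ctx_size),
    ]
    # Only pass -ngl when GPU offload is requested.
    if gpu_layers > 0:
        cmd += ["-ngl", str(gpu_layers)]
    return cmd


if __name__ == "__main__":
    cmd = build_llama_cli_command("Qwen3-32B-Q4_K_M.gguf", "Hello")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or quotes.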

Notes

Quantized using llama.cpp
Original model: https://huggingface.co/keypa/Qwen3-32B
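
This card does not state the exact commands used to produce the files. As a sketch of a typical llama.cpp workflow, and not necessarily what was done here, the original model is first converted to an f16 GGUF with convert_hf_to_gguf.py and then quantized with llama-quantize. The local directory name Qwen3-32B, the tool paths and the helper function names are assumptions for illustration.

```python
import shlex

# Quantization types matching the files in this repo (f16 is the unquantized base).
QUANT_TYPES = ["Q8_0", "Q6_K", "Q5_K_M", "Q4_K_M"]


def convert_command(model_dir: str, out_file: str) -> list[str]:
    """Build the argv for converting a Hugging Face checkpoint to an f16 GGUF."""
    return [
        "python", "convert_hf_to_gguf.py", model_dir,
        "--outfile", out_file,
        "--outtype", "f16",
    ]


def quantize_command(f16_file: str, quant: str) -> list[str]:
    """Build the argv for quantizing an f16 GGUF to the given type."""
    out_file = f16_file.replace("-f16.gguf", f"-{quant}.gguf")
    return ["./llama-quantize", f16_file, out_file, quant]


if __name__ == "__main__":
    f16 = "Qwen3-32B-f16.gguf"
    # "Qwen3-32B" is an assumed local checkout of the original model.
    print(shlex.join(convert_command("Qwen3-32B", f16)))
    for quant in QUANT_TYPES:
        print(shlex.join(quantize_command(f16, quant)))
```

The script only prints the commands, so they can be reviewed before being run from a llama.cpp checkout.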

Model details

Format: GGUF
Model size: 33B params
Architecture: qwen3