Данная модель была получена квантизацией RefalMachine/RuadaptQwen3-32B-Instruct через библиотеку autogptq на датасете pomelk1n/RuadaptQwen-Quantization-Dataset

Почему AWQ, а не GGUF?

На 09-06-2025 Qwen3 с квантизацией gguf не поддерживается в vLLM. FP8 квантизации же не работают с tensor parallelism = 4, из-за чего была выбрана точность 4bit

TODO

Прогнать модель на бенчмарках
Сделать GPTQ версию

Downloads last month: 122

Safetensors

Model size

33B params

Tensor type

I32

BF16

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for qdzzzxc/RuadaptQwen3-32B-Instruct-AWQ

Base model

Qwen/Qwen3-32B

Finetuned

RefalMachine/RuadaptQwen3-32B-Instruct

Quantized

(3)

this model

qdzzzxc
/

RuadaptQwen3-32B-Instruct-AWQ

Почему AWQ, а не GGUF?

TODO

Model tree for qdzzzxc/RuadaptQwen3-32B-Instruct-AWQ

Dataset used to train qdzzzxc/RuadaptQwen3-32B-Instruct-AWQ