---
license: apache-2.0
language:
- multilingual
base_model: Qwen/Qwen3.6-27B
tags:
- auto-round
- intel
- gguf
- quantization
---
# Qwen3.6-27B GGUF (AutoRound Quantized, MTP Enabled)
This repository contains GGUF quantized versions of [Qwen/Qwen3.6-27B](https://huggingface.co/Qwen/Qwen3.6-27B) created using Intel's [AutoRound](https://github.com/intel/auto-round) quantization method.
## Quantization Details
The models were quantized with Intel's AutoRound, using `HuggingFaceH4/ultrachat_200k` as the calibration dataset with a sequence length of 2850. MTP layers were not explicitly enabled during quantization, but the resulting files work with MTP in my testing.
```bash
auto-round \
  --model Qwen/Qwen3.6-27B \
  --output_dir ./quantized/ \
  --scheme <SCHEME> \
  --format <SCHEME> \
  --iters 0 \
  --nsamples 256 \
  --seqlen 2850 \
  --dataset "HuggingFaceH4/ultrachat_200k"
```
For now, only two quantization variants are provided: Q5_K_M and Q4_K_MIXED. Q4_K_MIXED is a custom variant based on Intel's original Q2_K_MIXED scheme, using Q4_K quants in place of Q2.
### Files and Sizes
| File Name | Quant Type | Size |
|-----------|------------|------|
| `Qwen3.6-27B-Q4_K_MIXED.gguf` | Q4_K_MIXED | 16.5 GB |
| `Qwen3.6-27B-Q5_K_M.gguf` | Q5_K_M | 19 GB |
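
As a rough sanity check on the sizes above, the average number of bits stored per weight can be estimated from the file size. The sketch below is an approximation only: `bits_per_weight` is a hypothetical helper, it assumes the model has a nominal 27 billion parameters (taken from the model name) and that 1 GB means 10^9 bytes, and it ignores GGUF metadata overhead.

```bash
# Hypothetical helper: estimate average bits per weight.
#   $1 = file size in GB (assumed 1 GB = 10^9 bytes)
#   $2 = parameter count in billions (assumed nominal value from the model name)
bits_per_weight() {
  awk -v gb="$1" -v params_b="$2" 'BEGIN { printf "%.2f\n", gb * 8 / params_b }'
}

bits_per_weight 16.5 27   # 16.5 GB file -> 4.89
bits_per_weight 19 27     # 19 GB file   -> 5.63
```

Both estimates land close to what the 4-bit and 5-bit K-quant families usually produce, which is consistent with the listed file sizes.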