LGxNDs committed
Commit 6b42c79 (verified) · 1 parent: 55d6ba1

Upload 2 files

This repository contains a Qwen3.6 model quantized to IQ2_M and stored in the GGUF format. IQ2_M belongs to the IQ family of quantization types (called i-quants in llama.cpp), which targets very low bit widths while preserving as much model quality as possible through mixed-precision techniques.

**Model Overview:**
Qwen3.6 is a series of large language models whose architecture includes grouped query attention (GQA), sliding window attention, and a tokenizer designed for efficient training and inference. The models aim for strong reasoning across diverse tasks while keeping compute cost down through multi-token prediction and MoE-style routing. The "35B-A3B" in the file names follows the Qwen naming convention for a mixture-of-experts model with roughly 35B total parameters and roughly 3B active per token.

**IQ2_M Quantization Process:**
IQ2_M is a nominally 2-bit, block-wise quantization scheme. Its effective cost is higher than 2 bits per weight (llama.cpp lists IQ2_M at about 2.7 bits per weight) for two reasons: each block of weights carries its own scaling factors, and the quantizer uses mixed precision, keeping the more sensitive tensors at higher bit widths while packing less sensitive weights into the smallest formats. The "M" (medium) variant sits at the larger, higher-quality end of the 2-bit IQ types, trading some of the size reduction for better quality retention. The result is a much smaller model that remains usable for practical applications.
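To make the idea of block-wise quantization with per-block scaling factors concrete, here is a minimal sketch in pure Python. It is a plain uniform 2-bit quantizer, not the IQ2_M algorithm, which is considerably more elaborate. The block length of 32, the 16-bit metadata fields, and all function names are assumptions chosen for illustration.

```python
from typing import List, Tuple

BLOCK_SIZE = 32  # illustrative block length (assumption, not the IQ2_M value)
LEVELS = 4       # 2 bits per weight -> 4 representable levels

Block = Tuple[float, float, List[int]]  # (scale, offset, 2-bit codes)


def quantize_block(block: List[float]) -> Block:
    """Map one block of weights to 2-bit codes plus a per-block scale and offset."""
    lo, hi = min(block), max(block)
    # A constant block has hi == lo; fall back to scale 1.0 to avoid dividing by zero.
    scale = (hi - lo) / (LEVELS - 1) or 1.0
    codes = [min(LEVELS - 1, max(0, round((w - lo) / scale))) for w in block]
    return scale, lo, codes


def dequantize_block(scale: float, lo: float, codes: List[int]) -> List[float]:
    """Reconstruct approximate weights from the codes and the block metadata."""
    return [lo + scale * c for c in codes]


def quantize(weights: List[float]) -> List[Block]:
    """Split the weights into fixed-size blocks and quantize each independently."""
    return [
        quantize_block(weights[i:i + BLOCK_SIZE])
        for i in range(0, len(weights), BLOCK_SIZE)
    ]


def dequantize(blocks: List[Block]) -> List[float]:
    out: List[float] = []
    for scale, lo, codes in blocks:
        out.extend(dequantize_block(scale, lo, codes))
    return out


def effective_bits_per_weight(code_bits: int = 2, meta_bits: int = 32) -> float:
    """Storage cost per weight once per-block metadata is counted.

    meta_bits assumes a 16-bit scale and a 16-bit offset per block (hypothetical).
    """
    return code_bits + meta_bits / BLOCK_SIZE
```

In this toy layout, every reconstructed weight lies within half a scale step of the original, and the per-block metadata raises the cost from 2 to 3 bits per weight. The same kind of overhead is one reason a "2-bit" format ends up above 2 bits per weight in practice.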

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+Qwen3.6-GeekedOutAi-35B-A3B-BF16-IQ2_M-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen3.6-GeekedOutAi-35B-A3B-BF16-IQ2_M-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
Qwen3.6-GeekedOutAi-35B-A3B-BF16-IQ2_M-00001-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c0681e394d4ad4e815297b2388307134573c1a3dd2a099e0b96b48336b4fcec9
+size 8302924576
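The three lines above are a Git LFS pointer: the repository stores only this small text file, while the `oid` and `size` fields identify the actual shard. A downloaded shard can be checked against those fields. The following is a minimal sketch using only the Python standard library; the function names and the local path in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict

# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c0681e394d4ad4e815297b2388307134573c1a3dd2a099e0b96b48336b4fcec9
size 8302924576
"""


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition(" ")
            fields[key] = value
    return fields


def matches_pointer(path: Path, pointer: Dict[str, str], chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and digest named by the pointer."""
    algo, _, expected = pointer["oid"].partition(":")
    # Cheap check first: a size mismatch means there is no need to hash gigabytes.
    if path.stat().st_size != int(pointer["size"]):
        return False
    digest = hashlib.new(algo)
    with path.open("rb") as fh:
        # Read in chunks so multi-gigabyte shards are never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

A typical call would be `matches_pointer(Path("model-00001-of-00002.gguf"), parse_lfs_pointer(POINTER_TEXT))`, where the path is a placeholder for wherever the shard was saved.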
Qwen3.6-GeekedOutAi-35B-A3B-BF16-IQ2_M-00002-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3b9ac80f9a243730299dc337795df3571b206caf8ca8ce3b49b1e6000a092345
+size 3356312000
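The two `size` fields in this commit allow a rough sanity check on the quantization level. The sketch below sums the shard sizes and divides by a nominal parameter count. The 35e9 figure is an assumption read off the "35B" in the file names, and the result is only an estimate, since the files also contain metadata and some tensors stored at higher precision.

```python
# Shard sizes in bytes, taken from the two LFS pointers in this commit.
SHARD_SIZES_BYTES = [8_302_924_576, 3_356_312_000]

# Assumption: nominal parameter count inferred from "35B" in the file names.
NOMINAL_PARAMS = 35e9

total_bytes = sum(SHARD_SIZES_BYTES)
total_gib = total_bytes / 2**30
bits_per_weight = total_bytes * 8 / NOMINAL_PARAMS

print(f"total size: {total_bytes} bytes ({total_gib:.2f} GiB)")
print(f"approximate bits per weight: {bits_per_weight:.2f}")
```

Under that assumption the estimate comes out at roughly 2.66 bits per weight, which is above 2 bits, as expected for a block-wise mixed-precision format.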