I took a look, and there is a wide array of different precisions in the "1-bit" quants. @danielhanchen, how can a "1-bit" model contain layers at Q8_0 and F32? Does "1-bit" refer only to the quantization of the ffn* layers? In that case, what is the average precision of the entire "1-bit" model?
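To make the "average precision" question concrete, here is a minimal sketch of how a weighted average in bits per weight could be computed across mixed tensor types. The tensor sizes are hypothetical placeholders, not values read from the GLM-5.2 GGUF, and the use of IQ1_S for the low-bit layers is an assumption. The bits-per-weight figures follow from the GGML block layouts (for example, Q8_0 stores 34 bytes per block of 32 weights).

```python
# Bits per weight for a few GGML tensor types, derived from their block layouts.
BITS_PER_WEIGHT = {
    "F32": 32.0,
    "Q8_0": 8.5,      # 34 bytes per block of 32 weights
    "IQ1_S": 1.5625,  # 50 bytes per block of 256 weights
}


def average_bits_per_weight(tensors):
    """Return the size-weighted average precision.

    tensors: list of (quant_type, n_weights) pairs.
    """
    total_bits = sum(BITS_PER_WEIGHT[qtype] * n for qtype, n in tensors)
    total_weights = sum(n for _, n in tensors)
    return total_bits / total_weights


# Hypothetical breakdown for illustration only (NOT measured from any real model):
# most weights in a 1-bit type, a few in Q8_0, and a small remainder in F32.
example = [
    ("IQ1_S", 900_000_000),
    ("Q8_0", 90_000_000),
    ("F32", 10_000_000),
]

print(f"{average_bits_per_weight(example):.3f} bits per weight")
```

Even with 90% of the weights in a 1-bit type, the higher-precision tensors in this made-up split pull the average up to roughly 2.5 bits per weight, which is why the per-layer breakdown matters.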
Michael
electrolanche
Recent Activity
replied to danielhanchen's post about 14 hours ago
1-bit GLM-5.2 GGUF vs. Claude 4.8 Opus vs. GPT-5.5
We gave 3 models the same prompt and compared one-shot outputs.
The 1-bit GLM-5.2 GGUF ran locally on a Mac Studio M3 Ultra with 256GB RAM at ~21.6 tok/s.
Which output do you like best?
GGUF: https://huggingface.co/unsloth/GLM-5.2-GGUF