Michael

electrolanche

AI & ML interests

None yet

Recent Activity

replied to danielhanchen's post about 14 hours ago

1-bit GLM-5.2 GGUF vs. Claude 4.8 Opus vs. GPT-5.5

We gave 3 models the same prompt and compared one-shot outputs. The 1-bit GLM-5.2 GGUF ran locally on a Mac Studio M3 Ultra with 256GB RAM at ~21.6 tok/s. Which output do you like best? GGUF: https://huggingface.co/unsloth/GLM-5.2-GGUF

replied to danielhanchen's post about 14 hours ago

new activity 4 days ago

unsloth/gemma-4-E4B-it: What is the difference between this model and the original google/gemma-4-E4B-it?

Organizations

None yet

replied to danielhanchen's post about 14 hours ago

I took a look and there is a wide array of different precisions in the "1-bit" quants. @danielhanchen, how can a "1-bit" model contain layers at Q8_0 and F32? Does "1-bit" only refer to the quantization of the ffn* layers? In that case, what is the average precision of the entire "1-bit" model?
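For anyone who wants to estimate that average, here is a minimal sketch of a parameter-weighted mean of bits per weight. The tensor groups and parameter counts are hypothetical placeholders, not measurements of the GLM-5.2 GGUF. The 1.5625 bits-per-weight figure assumes the ffn* tensors use an IQ1_S-style type, which the post does not confirm; Q8_0 is counted as 8.5 bits per weight because each block of 32 weights also stores a 16-bit scale.

```python
def average_bits_per_weight(groups):
    """Return the parameter-weighted mean of bits per weight.

    groups: iterable of (label, parameter_count, bits_per_weight).
    """
    total_params = sum(count for _, count, _ in groups)
    total_bits = sum(count * bpw for _, count, bpw in groups)
    return total_bits / total_params


# Hypothetical mix, for illustration only: the counts are made up and the
# low-bit type (1.5625 bpw, IQ1_S-style) is an assumption.
example_groups = [
    ("ffn tensors, low-bit type (assumed)", 900, 1.5625),
    ("tensors kept at Q8_0", 90, 8.5),
    ("tensors kept at F32", 10, 32.0),
]

avg = average_bits_per_weight(example_groups)
print(f"average precision: {avg:.3f} bits per weight")
```

With real numbers, the counts and types would come from the tensor listing of the actual GGUF file rather than from placeholders like these.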

replied to danielhanchen's post about 14 hours ago

Sample size n=1...
But seriously, it's impressive that a binary tree can output code that even executes. And at 1-bit there is no dynamic quantization (by definition): you can't go lower than 1-bit!

New activity in unsloth/gemma-4-E4B-it 4 days ago

What is the difference between this model and the original google/gemma-4-E4B-it?

#1 opened about 1 month ago by adel-cybral

replied to unmodeled-tyler's post 9 months ago

I’m curious to see how it compares against its base model, Qwen3 8B.

updated 3 collections over 1 year ago
