Some questions on BitNet PTQ

#1
by TomLucidor - opened
  1. A lot of repos these days still do not have much support for BitNet (I think), so how can this be run? https://github.com/vllm-project/vllm/issues/33142
  2. Will you ever try BitNet for Qwen3-30B-A3B or GPT-OSS (MoE-style)? What about Nemotron-3-Nano, Ring-Mini-Linear-2.0, or Kimi-Linear (hybrid attention)?
  3. How could Sherry techniques be wrapped under BitNet/ternary quantization?
  4. Could these models be benchmarked for their reasoning, agentic coding, and instruction-following abilities?

Cross-ref https://huggingface.co/nightmedia/Kimi-Linear-REAP-35B-A3B-Instruct-mxfp4-mlx/discussions/1

Owner

I will have to research some of this. I am unfortunately a self-taught hobbyist, so I have a lot of knowledge gaps. I've identified issues with the initial quant; I am working to patch them and will re-upload in place once they're resolved. Once these quant issues are sorted, I will work to ensure it extends to other model types and various implementations. I will have to follow up on the rest of your questions, but longer term I do plan to benchmark this model and the others I get ternary quantized.
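For context on what "ternary quantized" means here: the BitNet b1.58 line of work maps each weight to {-1, 0, +1} with a per-tensor absmean scale. A minimal NumPy sketch of that rounding step (illustrative only, not the code used for these quants; the function name and layout are my own):

```python
import numpy as np

def ternary_quantize(w, eps=1e-8):
    """Absmean ternary quantization, BitNet b1.58-style (a sketch).

    Scales the weight matrix by its mean absolute value, then rounds
    each element to the nearest value in {-1, 0, +1}.
    """
    scale = np.mean(np.abs(w)) + eps          # absmean scale (eps avoids div-by-zero)
    q = np.clip(np.round(w / scale), -1, 1)   # ternary codes in {-1, 0, +1}
    return q, scale                           # dequantize as q * scale

w = np.array([[0.9, -0.04, -1.3],
              [0.2,  0.0,   0.55]])
q, s = ternary_quantize(w)
# q → [[ 1.  0. -1.]
#      [ 0.  0.  1.]]
```

Post-training, the quality question is how much accuracy survives this rounding without the quantization-aware training the original BitNet models get, which is exactly what the benchmarking asked about above would measure.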

Please start with Qwen3.5 if possible, as they did one last banger. Looking forward to GitHub repos for quantizing/running this as well.
