wenhua cheng (wenhuach)
AI & ML interests: Model Compression, CV
Recent Activity
New activity 1 day ago in Intel/GLM-5-int4-mixed-AutoRound: "This model always predicts some few nonsense sequences"
Liked a model 2 days ago: Intel/Step-3.5-Flash-int4-mixed-AutoRound
New activity 3 days ago in Intel/Qwen3.5-122B-A10B-int4-AutoRound: "Does the A100 work?"
This model always predicts some few nonsense sequences · 5 comments · #1 opened 14 days ago by CharlesChen2023
Does the A100 work? · 7 comments · #1 opened 15 days ago by xz123321
Thanks! And MTP key question · 10 comments · #1 opened 11 days ago by seanthomaswilliams
Convert to gguf-q2ks-mixed-AutoRound? · 🔥 2 · 4 comments · #2 opened about 2 months ago by limcheekin
Qwen/Qwen3-Next-80B-A3B-Thinking has MMLU_PRO 82.7 but you guys get 0.7271 · 3 comments · #2 opened 6 months ago by hlxxxxxx
AutoRound request: GLM-4.5-Air · 1 comment · #1 opened 2 months ago by babytifa
2507 Thinking model release · 11 comments · #4 opened 5 months ago by anjeysapkovski
How to use this kernel · #1 opened 2 months ago by wenhuach
Thinking version has been deleted? · 1 comment · #2 opened 3 months ago by reswewr
Improve model card: Add pipeline tag, library name, and update paper/citation · 🔥 1 · #1 opened 3 months ago by nielsr
Could we get more w2a16, w3a16, and w4a16 AutoRound · 1 reaction · 1 comment · #1 opened 4 months ago by twhitworth
Practical performance feedback · 1 comment · #2 opened 4 months ago by maigonis
Works good with vLLM, just no tool calling · 1 comment · #1 opened 7 months ago by Ununnilium
Inference with llama.cpp + Open WebUI gives repeating `?` · 4 comments · #1 opened 5 months ago by whoisjeremylam
Adding `transformers` as the library tag · #3 opened 6 months ago by ariG23498
CPU only? · 4 comments · #2 opened 6 months ago by jujutechnology
Error loading in vLLM 10.1+ · 1 reaction · 2 comments · #2 opened 6 months ago by freegheist
Bits for embedding / lm-head / non-expert layers · 1 comment · #1 opened 7 months ago by sokann
Awesome! Any chance of this for GLM-4.5? · 2 comments · #1 opened 7 months ago by Fernanda24