wenhua cheng
wenhuach
AI & ML interests
Model Compression, CV
Recent Activity
new activity 8 days ago · Intel/gemma-4-31B-it-int4-AutoRound: Please update chat template
updated a model 8 days ago · Intel/gemma-4-31B-it-int4-AutoRound
new activity 8 days ago · Intel/gemma-4-31B-it-int4-AutoRound: FP4?
Please update chat template · 2 · #4 opened 8 days ago by alexcardo
Why delete Intel/Qwen3.6-35B-A3B-int4-AutoRound? · 3 · #1 opened 14 days ago by bgeneto
Does this even run on Intel GPUs? · 4 · #2 opened 10 days ago by Thomas98519864
AutoRound quant fails to load with mlx-lm · 👍 1 · 1 · #1 opened 12 days ago by smcleod
How does this compare to the original 8-bit Qwen quant and the 4-bit AutoRound quant? · 2 · #5 opened 21 days ago by sparx3
Any plan for an Ampere-compatible version? · 2 · #2 opened 17 days ago by electroglyph
Fails to load on Ampere (sm_86) at TP=2: Marlin kernel rejects 32-dim weight slice · 2 · #3 opened 19 days ago by wasifb
MTP 0 accept rate · 2 · #4 opened 25 days ago by AMUN-RA1
Installation Video and Testing - Step by Step · 👍 3 · 5 · #1 opened 28 days ago by fahdmirzac
GGUF version · 🔥 1 · 1 · #1 opened 29 days ago by limcheekin
Performance indicators · 👍 3 · 4 · #1 opened about 2 months ago by dehnhaide
This model always predicts a few nonsense sequences · 8 · #1 opened 2 months ago by CharlesChen2023
Does the A100 work? · 12 · #1 opened 2 months ago by xz123321
Thanks! And a key MTP question · 11 · #1 opened 2 months ago by seanthomaswilliams
Convert to gguf-q2ks-mixed-AutoRound? · 🔥 2 · 4 · #2 opened 4 months ago by limcheekin
Qwen/Qwen3-Next-80B-A3B-Thinking has MMLU_PRO 82.7 but you guys get 0.7271 · 3 · #2 opened 8 months ago by hlxxxxxx
AutoRound request: GLM-4.5-Air · 1 · #1 opened 4 months ago by babytifa
2507 Thinking model release · 11 · #4 opened 7 months ago by anjeysapkovski