wenhua cheng (wenhuach)
AI & ML interests: Model Compression, CV
Recent Activity
New activity 1 day ago in Intel/GLM-5-int4-mixed-AutoRound: "This model always predicts some few nonsense sequences"
Liked a model 2 days ago: Intel/Step-3.5-Flash-int4-mixed-AutoRound
New activity 3 days ago in Intel/Qwen3.5-122B-A10B-int4-AutoRound: "Does the A100 work?"
This model always predicts some few nonsense sequences · 5 comments · #1 opened 14 days ago by CharlesChen2023
Does the A100 work? · 7 comments · #1 opened 15 days ago by xz123321
Thanks! And MTP key question · 10 comments · #1 opened 11 days ago by seanthomaswilliams
Convert to gguf-q2ks-mixed-AutoRound? · 🔥 2 · 4 comments · #2 opened about 2 months ago by limcheekin
Qwen/Qwen3-Next-80B-A3B-Thinking has MMLU_PRO 82.7 but you guys get 0.7271 · 3 comments · #2 opened 6 months ago by hlxxxxxx
AutoRound request: GLM-4.5-Air · 1 comment · #1 opened 2 months ago by babytifa
2507 Thinking model release · 11 comments · #4 opened 5 months ago by anjeysapkovski
How to use this kernel · #1 opened 2 months ago by wenhuach
Thinking version has been deleted? · 1 comment · #2 opened 3 months ago by reswewr
Improve model card: Add pipeline tag, library name, and update paper/citation · 🔥 1 · #1 opened 3 months ago by nielsr
Could we get more w2a16, w3a16, and w4a16 AutoRound · 1 reaction · 1 comment · #1 opened 4 months ago by twhitworth
Practical performance feedback · 1 comment · #2 opened 4 months ago by maigonis
Works good with vLLM, just no tool calling · 1 comment · #1 opened 7 months ago by Ununnilium
Inference with llama.cpp + Open WebUI gives repeating `?` · 4 comments · #1 opened 5 months ago by whoisjeremylam
Adding `transformers` as the library tag · #3 opened 6 months ago by ariG23498
CPU only? · 4 comments · #2 opened 6 months ago by jujutechnology
Error loading in vLLM 10.1+ · 1 reaction · 2 comments · #2 opened 6 months ago by freegheist
Bits for embedding / lm-head / non-expert layers · 1 comment · #1 opened 7 months ago by sokann
Awesome! Any chance of this for GLM-4.5? · 2 comments · #1 opened 7 months ago by Fernanda24