plz
qenme
AI & ML interests
None yet
Recent Activity
new activity about 22 hours ago
cyankiwi/Qwen3.6-27B-AWQ-BF16-INT8: Compatible with vLLM on Ampere?
new activity 13 days ago
RedHatAI/gemma-4-31B-it-FP8-Dynamic: Can you add more details to the model card?
new activity 21 days ago
google/gemma-4-31B-it-qat-w4a16-ct: 26b with w416a-ct
Organizations
None yet
Compatible with vLLM on Ampere?
➕ 4
7
#3 opened about 2 months ago
by
HenkTenk
Can you add more details to the model card?
6
#1 opened 3 months ago
by
bullerwins
26b with w416a-ct
1
#3 opened 22 days ago
by
meganoob1337
Gemma team in the back making shrimp fried rice!!!!
🔥 1
#1 opened 23 days ago
by
qenme
KL divergence benchmark
2
#3 opened 24 days ago
by
qenme
Tokenizer problems, or just quants?
2
#105 opened about 2 months ago
by
nesymerp1
Me again
3
#2 opened about 1 month ago
by
qenme
MTP support?
👍 1
8
#5 opened about 1 month ago
by
Nindaleth
Very bad results with model quant and KV cache quant, only BF16 works well
👍👀 5
4
#34 opened 2 months ago
by
qenme
F16 or BF16?
5
#6 opened about 1 month ago
by
qenme
FYI : --spec-type mtp syntax has changed to --spec-type draft-mtp
👍 3
3
#14 opened about 2 months ago
by
qenme
presence-penalty
5
#8 opened about 2 months ago
by
owao
Good quant!
12
#1 opened about 2 months ago
by
qenme
Working good on 96GB VRAM + DDR5 Setup
❤️ 1
5
#2 opened about 2 months ago
by
phakio
GOOLE WHERE IS MTP ?
🔥 2
2
#82 opened 2 months ago
by
EvilinaMaller
10/10
🔥 6
1
#4 opened 2 months ago
by
qenme
thanks!
🤗❤️ 19
2
#1 opened 2 months ago
by
qenme
Will there be a small model for speculative decoding?
3
#71 opened 2 months ago
by
Regrin
Thanks
➕ 8
1
#2 opened 4 months ago
by
qenme