Installation Video and Testing - Step by Step
#13 opened about 1 hour ago by fahdmirzac
llama.cpp inference - 20 times (!) slower than OSS 20 on a RTX 5090
7
#12 opened about 6 hours ago by cmp-nct
We are so back!
❤️ 2
#10 opened about 7 hours ago by Carnyzzle
The adaptation for SGLang is being processed.
#9 opened about 8 hours ago by ZHANGYUXUAN-zR
Is a dedicated Tech Report planned for GLM-4.7-Flash?
1
#8 opened about 8 hours ago by NodeLinker
FP8
3
#7 opened about 8 hours ago by Daemontatox
Recommended sampling parameters
2
#6 opened about 8 hours ago by sszymczyk
DeepseekV3ForCausalLM
🔥 2 1
#5 opened about 8 hours ago by davidboring
Thank you!
🔥 11
#4 opened about 9 hours ago by mav23
Enormous KV-cache size?
👍 ➕ 2 5
#3 opened about 9 hours ago by nephepritou
Base model
🔥 5 1
#2 opened about 9 hours ago by tcpmux
Performance Discussion
2
#1 opened about 9 hours ago by IndenScale