0xSero PRO
AI & ML interests
Quantizing, benchmarking, training, and building.
Recent Activity
updated a model about 1 hour ago
0xSero/GLM-5.2-504B-Nvidia
published a model about 1 hour ago
0xSero/GLM-5.2-504B-Nvidia
updated a model 1 day ago
0xSero/GLM-5.2-504B
Organizations
Axis 4 reasoning/termination: 3,057 -- Can you share this calibration data?
3
#2 opened 2 days ago by 0xSero
Korean Multilingual is broken.
4
#6 opened 18 days ago by DFveloper
Starting GPTQ model on H200 fails
3
#2 opened 29 days ago by hjjg85
error with vLLM
6
#3 opened about 1 month ago by bnjmnmarie
Running models with vLLM on the RTX Pro 6000 - SM120
👀👍 2
11
#28 opened about 2 months ago by liku2001
What kind of token per second can we expect if running Windows and llama.cpp on Framework Desktop?
2
#2 opened about 2 months ago by Clauscdotcom
Thank you 🙏
4
#1 opened 2 months ago by BlueNipples
Special Token Disaster: Your Tech Lead Has Zero Design Taste
👀🔥 3
4
#6 opened 2 months ago by ytgui
How are you running this?
5
#1 opened 2 months ago by richardhundt
use chat template here or the new one from google?
1
#5 opened 2 months ago by Grandys
Repetition loops with llama.cpp defaults
1
#2 opened 3 months ago by todaymare
"vLLM Launch Failed - Kernel Incompatibility with RTX 3090"
5
#3 opened 4 months ago by steppi
Add recovery report for VLM fix
#1 opened 4 months ago by 0xSero
VLM recovery: multimodal config/index + clean vision encoder + manifest
#2 opened 4 months ago by 0xSero
Love it - Future suggestion
❤️ 1
1
#1 opened 4 months ago by 0xSero
Might be doing something wrong serving with vllm
1
#4 opened 6 months ago by yuchenxie
Quantization method
#1 opened 7 months ago by 0xSero
We need 50 or 60% expert pruning please
4
#3 opened 7 months ago by hxssgaa
FP8 Please?
👍 2
1
#1 opened 7 months ago by 0xSero
Interleaved Thinking, minimax:tool_call parsing
👍 1
1
#29 opened 8 months ago by 0xSero