Henry
hdnh2006
AI & ML interests
Math, data & AI.
Recent Activity
new activity 21 days ago
RedHatAI/Qwen3.6-35B-A3B-NVFP4: MTP is slower than the "normal" serve
updated a model 28 days ago
NeuralNet-Hub/Qwen3.6-27B-NVFP4
new activity 28 days ago
NeuralNet-Hub/Qwen3.6-27B-NVFP4: NVFP4 on RTX 5090: 120k Context & 8-bit KV Cache Feasibility
MTP is slower than the "normal" serve
4
#10 opened about 1 month ago
by
hdnh2006
NVFP4 on RTX 5090: 120k Context & 8-bit KV Cache Feasibility
1
#1 opened about 1 month ago
by
nsfilho
processor_config.json file upload for smooth integration with vLLM
#5 opened 29 days ago
by
hdnh2006
Recommend to upload process_config.json from original model
1
#7 opened 29 days ago
by
hdnh2006
llama.cpp version?
1
#5 opened about 1 month ago
by
hdnh2006
New activity in DavidAU/Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking about 1 month ago
vLLM version? I am getting error with some layers
2
#6 opened about 1 month ago
by
hdnh2006
After applying patch, the model is unable to serve
#3 opened about 1 month ago
by
hdnh2006
This model is not uncensored at all
1
#2 opened about 1 month ago
by
hdnh2006
Why is it too big?
➕ 3
9
#1 opened about 2 months ago
by
alexcardo
MTP?
➕ 4
3
#3 opened about 1 month ago
by
tyapo
Impossible to run on a 5090 with any context window, not even 8192.
2
#2 opened about 1 month ago
by
hdnh2006
vLLM is unable to deploy this model
#3 opened about 1 month ago
by
hdnh2006
New activity in DreamFast/Qwen3.6-27B-Uncensored-HauhauCS-Aggressive-Safetensor-Benchmark about 1 month ago
NVFP4 version?
3
#3 opened about 1 month ago
by
hdnh2006
New activity in DreamFast/Qwen3.5-27B-Uncensored-HauhauCS-Aggressive-Safetensor-Benchmark about 1 month ago
Include visual layers in `model.safetensors.index.json`
#1 opened about 1 month ago
by
hdnh2006
Impossible to deploy using vLLM 0.20.0
2
#6 opened about 2 months ago
by
hdnh2006
safetensor?
👀 3
10
#4 opened 2 months ago
by
MRU4913
Working vLLM setup on RTX 5090 — 194-197 tok/s with image/video
👍🚀 2
5
#3 opened 4 months ago
by
8055izham
Is there some helpful regex to offload all MoE layers to the CPU?
4
#7 opened 4 months ago
by
hdnh2006
Off reasoning
👀 2
4
#56 opened 5 months ago
by
kmahdi