Henry

hdnh2006

https://henrynavarro.org

AI & ML interests

Math, data & AI.

Recent Activity

new activity 21 days ago

RedHatAI/Qwen3.6-35B-A3B-NVFP4:MTP is slower than the "normal" serve

updated a model 28 days ago

NeuralNet-Hub/Qwen3.6-27B-NVFP4

new activity 28 days ago

NeuralNet-Hub/Qwen3.6-27B-NVFP4:NVFP4 on RTX 5090: 120k Context & 8-bit KV Cache Feasibility

New activity in RedHatAI/Qwen3.6-35B-A3B-NVFP4 21 days ago

MTP is slower than the "normal" serve

#10 opened about 1 month ago by

New activity in NeuralNet-Hub/Qwen3.6-27B-NVFP4 28 days ago

NVFP4 on RTX 5090: 120k Context & 8-bit KV Cache Feasibility

#1 opened about 1 month ago by

New activity in wangzhang/gemma-4-31B-it-abliterated 29 days ago

processor_config.json file upload for smooth integration with vLLM

#5 opened 29 days ago by

New activity in wangzhang/gemma-4-26B-A4B-it-abliterix 29 days ago

Recommend uploading process_config.json from the original model

#7 opened 29 days ago by

This PR uploads the process_config.json file, allowing the model to be served with vLLM without issues

#8 opened 29 days ago by

New activity in cyberneurova/CyberNeurova-DeepSeek-V4-Flash-abliterated-GGUF about 1 month ago

llama.cpp version?

#5 opened about 1 month ago by

New activity in DavidAU/Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking about 1 month ago

vLLM version? I am getting error with some layers

#6 opened about 1 month ago by

New activity in AEON-7/Gemma-4-31B-it-DECKARD-HERETIC-Uncensored-NVFP4 about 1 month ago

After applying patch, the model is unable to serve

#3 opened about 1 month ago by

New activity in AEON-7/Gemma-4-26B-A4B-it-Uncensored-NVFP4 about 1 month ago

This model is not uncensored at all

#2 opened about 1 month ago by

New activity in unsloth/Qwen3.6-27B-NVFP4 about 1 month ago

Why is it too big?

#1 opened about 2 months ago by

MTP?

#3 opened about 1 month ago by

Impossible to run on a 5090 with any context window, not even 8192.

#2 opened about 2 months ago by

New activity in Jiunsong/supergemma4-26b-abliterated-multimodal about 1 month ago

vLLM is unable to deploy this model

#3 opened about 1 month ago by

New activity in DreamFast/Qwen3.6-27B-Uncensored-HauhauCS-Aggressive-Safetensor-Benchmark about 1 month ago

NVFP4 version?

#3 opened about 1 month ago by

New activity in DreamFast/Qwen3.5-27B-Uncensored-HauhauCS-Aggressive-Safetensor-Benchmark about 1 month ago

Include visual layers in `model.safetensors.index.json`

#1 opened about 1 month ago by

New activity in RedHatAI/gemma-4-26B-A4B-it-NVFP4 about 2 months ago

Impossible to deploy using vLLM 0.20.0

#6 opened about 2 months ago by

New activity in HauhauCS/Qwen3.6-27B-Uncensored-HauhauCS-Aggressive 2 months ago

safetensor?

#4 opened 2 months ago by

New activity in Qwen/Qwen3.5-35B-A3B-GPTQ-Int4 4 months ago

Working vLLM setup on RTX 5090 — 194-197 tok/s with image/video

#3 opened 4 months ago by

New activity in unsloth/Qwen3.5-35B-A3B-GGUF 4 months ago

Is there some helpful regex to offload all MoE layers to the CPU?

#7 opened 4 months ago by

New activity in zai-org/GLM-4.7-Flash 5 months ago

Off reasoning

#56 opened 5 months ago by