GadflyII/Qwen3-Coder-Next-NVFP4

Text Generation
Transformers
Safetensors
qwen3_next
qwen3
Mixture of Experts
nvfp4
quantized
llmcompressor
vllm
conversational
compressed-tensors

Is it possible for vllm (or other) to run this NVFP4 on RTX 5090 with 128GB RAM?

#6 opened 29 days ago by vladulidlo

Why Your NVFP4 Model Is Slower Than FP8 on the GB10 (NVIDIA Spark) — And How to Fix It

👍 2
2
#5 opened about 1 month ago by scottgl

Model requests?

12
#4 opened about 1 month ago by pathosethoslogos

MMLU PRO Benchmark

3
#3 opened about 2 months ago by sevapru

vLLM 0.16?

1
#2 opened about 2 months ago by MMaxHugg