Is it possible for vLLM (or another engine) to run this NVFP4 model on an RTX 5090 with 128GB RAM?
#6 opened 29 days ago by vladulidlo
Why Your NVFP4 Model Is Slower Than FP8 on the GB10 (NVIDIA Spark) — And How to Fix It
2
#5 opened about 1 month ago by scottgl
Model requests?
12
#4 opened about 1 month ago by pathosethoslogos
MMLU-Pro Benchmark
3
#3 opened about 2 months ago by sevapru
vLLM 0.16?
1
#2 opened about 2 months ago by MMaxHugg