Request quantization model
1
#6 opened 23 days ago by win10
cost estimates?
4
#5 opened 23 days ago by lightenup
Is NVFP4 supported on sm120 (Blackwell RTX PRO 6000, RTX 5090, etc.)?
10
#4 opened 24 days ago by Fernanda24
works with vLLM, with FLASHINFER_MOE_FP4
1
#2 opened 27 days ago by bnjmnmarie
Can you reduce the size from 62 GB to the 35-40 GB range, at 4-bit or lower?
4
#1 opened 27 days ago by Prompt48