Un-quantize MTP block (revert to native precision) 9e7e88e verified jingyux-nv committed 2 days ago
Drop FlashInfer MoE env vars from vLLM deploy command b9e53ec verified jingyux-nv committed 4 days ago
Rename TensorRT Model Optimizer link to Model Optimizer (NVIDIA/Model-Optimizer) 994894a verified jingyux-nv committed 4 days ago
Update README: replace eval table (GPQA/AA-LCR/τ²-Bench/SciCode/IFBench); switch runtime to SGLang and vLLM 2e68f25 verified jingyux-nv committed 4 days ago